Boltz Partial Diffusion

Runs a Boltz-guided partial diffusion prediction using provided reference structures. Supports single-chain (simple) and multichain (multimer) systems, can optionally fix specific chains, and can emit diffusion trajectories for analysis. If affinity properties are present in the YAML, it also computes affinity-related outputs.

Usage

Use this node when you want to refine or remodel structures starting from reference PDBs rather than generating from scratch. Typical workflow: build sequences/properties with Boltz builders, combine them into a Boltz YAML and auxiliary files, then pass those along with one or more reference PDB structures into this node to perform partial diffusion. Adjust the partial_diffusion_fraction to control how much the model deviates from the reference, and optionally fix chains that should remain static.
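
As a rough illustration of the workflow above, the following Python sketch assembles the main inputs as plain dictionaries. The dictionary shapes follow the examples in the Inputs table; the helper name build_partial_diffusion_inputs, the chain ID "A", and the range check on the fraction are assumptions made for illustration, not part of the node's API.

```python
def build_partial_diffusion_inputs(sequence, reference_pdbs, fraction=0.25, fixed_chains=""):
    """Assemble a dict of node inputs (hypothetical helper, illustration only)."""
    # Assumption: the fraction is meant to lie in [0.0, 1.0], where
    # 0.0 = no change from the reference and 1.0 = full diffusion.
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("partial_diffusion_fraction must be between 0.0 and 1.0")
    return {
        # Single protein chain with the (assumed) chain ID "A".
        "boltz_yaml": {"sequences": [{"protein": {"id": "A", "sequence": sequence}}]},
        # No auxiliary files in this minimal case.
        "boltz_files": {},
        # Mapping of file name -> PDB text.
        "reference_structures": dict(reference_pdbs),
        "seed": 42,
        "partial_diffusion_fraction": fraction,
        "fixed_chains": fixed_chains,
    }


inputs = build_partial_diffusion_inputs("MSSSS", {"ref_model_1.pdb": "HEADER ...\nATOM ...\nEND"})
```

A low fraction such as 0.25 keeps the result close to the reference; raising it toward 1.0 allows more remodeling.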

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| boltz_yaml | True | BOLTZ_YAML | Boltz configuration dictionary produced upstream (e.g., from a YAML combiner). Must include at least one sequence and may include properties (such as affinity). | {'sequences': [{'protein': {'id': 'A', 'sequence': 'MSSSS...END'}}], 'properties': [{'affinity': {'target': 'ligand1'}}]} |
| boltz_files | True | BOLTZ_FILES | Auxiliary files referenced by the Boltz YAML (e.g., templates, constraints, ligands). | {'templates': {'template1.pdb': ''}, 'ligands': {'ligand1.sdf': ''}} |
| reference_structures | True | PDB | Reference structure(s) in PDB format to guide partial diffusion. Provide a dictionary mapping names to PDB content. Chain IDs should align with those specified in the Boltz YAML sequences. | {'ref_model_1.pdb': 'HEADER ...\\nATOM ...\\nEND', 'ref_model_2.pdb': 'HEADER ...\\nATOM ...\\nEND'} |
| seed | True | INT | Random seed. If multiple sequences are present, each is offset by index to ensure unique runs. | 42 |
| partial_diffusion_fraction | False | FLOAT | Fraction of the diffusion process to perform relative to the reference (0.0 = no change from reference, 1.0 = full diffusion). | 0.25 |
| save_trajectory | False | BOOLEAN | If true, saves intermediate diffusion trajectories for analysis. | False |
| fixed_chains | False | STRING | Comma-separated chain IDs to keep fixed (e.g., "A,B"). Must match chain IDs in the Boltz YAML sequences. Leave empty to allow all chains to move. | A,B |
| recycling_steps | False | INT | Number of recycling iterations. | 3 |
| sampling_steps | False | INT | Number of sampling steps during diffusion. | 200 |
| diffusion_samples | False | INT | Number of diffusion samples to generate per sequence. | 1 |
| max_parallel_samples | False | INT | Maximum number of samples to process in parallel. | 5 |
| step_scale | False | FLOAT | Diffusion step scale (temperature-like parameter). | 1.638 |
| output_format | False | ENUM[pdb, mmcif] | Format for output structures. | pdb |
| num_workers | False | INT | Number of worker processes to use (0 disables multiprocessing). | 0 |
| use_potentials | False | BOOLEAN | Use inference-time potentials. Recommended for partial diffusion. | True |
| write_full_pae | False | BOOLEAN | If true, writes the full PAE matrix. | False |
| write_full_pde | False | BOOLEAN | If true, writes the full PDE matrix. | False |
| max_msa_seqs | False | INT | Maximum number of MSA sequences to use. | 8192 |
| subsample_msa | False | BOOLEAN | If true, subsamples the MSA. | False |
| num_subsampled_msa | False | INT | Number of MSA sequences to retain when subsampling. | 1024 |
| affinity_mw_correction | False | BOOLEAN | Apply molecular weight correction when computing affinity (only used if affinity properties are present). | False |
| sampling_steps_affinity | False | INT | Sampling steps to use for affinity prediction (only used if affinity properties are present). | 200 |
| diffusion_samples_affinity | False | INT | Number of diffusion samples for affinity prediction (only used if affinity properties are present). | 5 |
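
The seed description says each sequence is offset by index. A minimal sketch of one plausible reading, assuming the offset is simply seed + index (the node's actual scheme is not specified here, and per_sequence_seeds is a hypothetical name):

```python
def per_sequence_seeds(seed, num_sequences):
    """Return one seed per sequence, assuming an offset of seed + index."""
    return [seed + index for index in range(num_sequences)]


# Three sequences starting from the base seed 42.
seeds = per_sequence_seeds(42, 3)
```

Under this assumption, rerunning with the same base seed reproduces the same per-sequence seeds.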

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| structures.pdb | PDB | Predicted structure files in the chosen format, returned as a dictionary mapping names to file content. | {'seqA_ranked_0.pdb': 'HEADER ...\\nATOM ...\\nEND', 'seqA_ranked_1.pdb': 'HEADER ...\\nATOM ...\\nEND'} |
| confidence.json | JSON | Confidence metrics for the predictions (e.g., per-model scores and matrices), keyed by sequence/model. | {'seqA_plddt.json': {'mean_plddt': 84.1}, 'seqA_pae.json': {'shape': [200, 200]}} |
| trajectory.pdb | PDB | Diffusion trajectory snapshots in PDB format when trajectory saving is enabled. May be empty if save_trajectory is false. | {'seqA_traj_step_050.pdb': 'ATOM ...', 'seqA_traj_step_200.pdb': 'ATOM ...'} |
| affinity.json | JSON | Affinity-related outputs if affinity properties were requested; otherwise returned as an empty object. | {'seqA_affinity.json': {'predicted_kd': 120.5, 'units': 'nM'}} |
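
To show how the confidence output might be consumed downstream, here is a small Python sketch that picks the entry with the highest mean pLDDT. It assumes entries carry a mean_plddt field as in the example above; the real key names and layout may differ, and best_by_mean_plddt is a hypothetical helper.

```python
def best_by_mean_plddt(confidence):
    """Return the key with the highest mean_plddt, or None if no entry has one."""
    scored = {
        name: entry["mean_plddt"]
        for name, entry in confidence.items()
        # Skip entries (such as PAE summaries) that carry no mean_plddt field.
        if isinstance(entry, dict) and "mean_plddt" in entry
    }
    if not scored:
        return None
    return max(scored, key=scored.get)


confidence = {
    "seqA_plddt.json": {"mean_plddt": 84.1},
    "seqA_pae.json": {"shape": [200, 200]},
}
best = best_by_mean_plddt(confidence)
```

Returning None for an empty or score-free dictionary avoids raising on runs that produced no confidence metrics.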

Important Notes

  • Input requirements: boltz_yaml must contain at least one sequence, boltz_files must be a dictionary, and reference_structures must be a non-empty dictionary of valid PDB content.
  • Partial diffusion control: partial_diffusion_fraction regulates deviation from the reference; lower values keep structures closer to the supplied PDB(s).
  • Chain handling: fixed_chains must match chain IDs defined in the Boltz YAML sequences. Invalid IDs will cause an error.
  • Mode detection: The node automatically treats the job as simple (single chain) or multimer (multiple chains) based on chain IDs in the YAML.
  • Trajectories: trajectory outputs are only populated if save_trajectory is true; enabling this may increase runtime and memory usage.
  • Affinity outputs: Affinity parameters are only applied when affinity properties are present in the boltz_yaml; otherwise affinity inputs are ignored and the affinity output will be empty.
  • Performance: num_workers controls multiprocessing; increasing it may speed up processing but requires more system resources.
  • MSA controls: max_msa_seqs caps the number of MSA sequences used. If subsample_msa is true, the MSA is further subsampled down to num_subsampled_msa sequences.
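
The input requirements, chain handling, and mode detection notes above can be approximated with a pre-flight check before running the node. This is a sketch under stated assumptions, not the node's implementation: the function names and error messages are illustrative, the YAML layout follows the boltz_yaml example, and an id given as a list of chain IDs is also accepted.

```python
def chain_ids(boltz_yaml):
    """Collect chain IDs from the 'sequences' entries of a Boltz YAML dict."""
    ids = []
    for entry in boltz_yaml.get("sequences", []):
        for molecule in entry.values():  # e.g. {"protein": {"id": "A", ...}}
            mol_id = molecule.get("id")
            if mol_id is None:
                continue
            # Assumption: an id may be a single string or a list of strings.
            ids.extend(mol_id if isinstance(mol_id, list) else [mol_id])
    return ids


def parse_fixed_chains(fixed_chains):
    """Split 'A,B' into ['A', 'B'], tolerating stray spaces and empty items."""
    return [chain.strip() for chain in fixed_chains.split(",") if chain.strip()]


def preflight(boltz_yaml, boltz_files, reference_structures, fixed_chains=""):
    """Validate inputs and return 'simple' or 'multimer' (illustrative only)."""
    ids = chain_ids(boltz_yaml)
    if not ids:
        raise ValueError("boltz_yaml must contain at least one sequence")
    if not isinstance(boltz_files, dict):
        raise ValueError("boltz_files must be a dictionary")
    if not isinstance(reference_structures, dict) or not reference_structures:
        raise ValueError("Reference structures are required")
    missing = [chain for chain in parse_fixed_chains(fixed_chains) if chain not in ids]
    if missing:
        raise ValueError(f"Fixed chains {missing} not found")
    # More than one chain ID is treated as a multimer job.
    return "multimer" if len(ids) > 1 else "simple"
```

This sketch strips whitespace around chain IDs for convenience; the node itself may be stricter, so passing IDs without spaces remains the safer choice.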

Troubleshooting

  • Error: 'Reference structures are required': Ensure reference_structures is a non-empty dictionary and contains valid PDB text.
  • Error: 'Fixed chains ... not found': Verify fixed_chains matches chain IDs in the YAML sequences (e.g., A,B). Remove spaces or typos and ensure YAML defines those chain IDs.
  • No affinity output produced: Confirm boltz_yaml includes an affinity property. Without it, affinity-related settings are ignored and the affinity output will be empty.
  • Empty trajectory output: Set save_trajectory to true to generate trajectory snapshots.
  • Outputs missing expected models: Increase diffusion_samples or sampling_steps, and check that partial_diffusion_fraction and step_scale are appropriate for the desired degree of remodeling.
  • MSA too large or slow: Lower max_msa_seqs or enable subsample_msa and adjust num_subsampled_msa to reduce compute.