Boltz Partial Diffusion

Runs a Boltz-guided partial diffusion prediction starting from the provided reference structures. Supports single-chain (simple) and multi-chain (multimer) systems, can optionally hold specific chains fixed, and can save diffusion trajectories for analysis. If the Boltz YAML includes affinity properties, the node also computes affinity outputs.

Usage

Use this node when you want to refine or remodel structures starting from reference PDBs rather than generating them from scratch. Typical workflow: build sequences and properties with the Boltz builders, combine them into a Boltz YAML plus auxiliary files, then pass both, together with one or more reference PDB structures, into this node to perform partial diffusion. Adjust partial_diffusion_fraction to control how far the model may deviate from the reference (0.0 keeps the reference unchanged, 1.0 runs full diffusion), and optionally list chains in fixed_chains that should remain static.
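The inputs for a typical run can be sketched as plain Python data. This is illustrative only: the keys mirror the field names in the Inputs table, but the `node_inputs` dict is a hypothetical stand-in for wiring the node in a workflow, and the sequences and PDB text are placeholders.

```python
# Illustrative only: keys mirror the node's input field names, but this
# dict is a hypothetical stand-in for workflow wiring, not a real API.

# Placeholder sequences and PDB text; replace with real data.
boltz_yaml = {
    "sequences": [
        {"protein": {"id": "A", "sequence": "MSSSS"}},
        {"protein": {"id": "B", "sequence": "MKTAY"}},
    ]
}
boltz_files = {}  # no templates, constraints, or ligands in this sketch
reference_structures = {"ref_model_1.pdb": "HEADER ...\nATOM ...\nEND"}

node_inputs = {
    "boltz_yaml": boltz_yaml,
    "boltz_files": boltz_files,
    "reference_structures": reference_structures,
    "seed": 42,
    "partial_diffusion_fraction": 0.25,  # low value: stay close to the reference
    "fixed_chains": "A",                 # keep chain A static, let chain B move
}
```

A low partial_diffusion_fraction combined with a fixed chain suits cases such as refining one partner of a complex while leaving the other untouched.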

Inputs

Field	Required	Type	Description	Example
boltz_yaml	True	BOLTZ_YAML	Boltz configuration dictionary produced upstream (e.g., from a YAML combiner). Must include at least one sequence and may include properties (such as affinity).	{'sequences': [{'protein': {'id': 'A', 'sequence': 'MSSSS...END'}}], 'properties': [{'affinity': {'target': 'ligand1'}}]}
boltz_files	True	BOLTZ_FILES	Auxiliary files referenced by the Boltz YAML (e.g., templates, constraints, ligands).	{'templates': {'template1.pdb': ''}, 'ligands': {'ligand1.sdf': ''}}
reference_structures	True	PDB	Reference structure(s) in PDB format to guide partial diffusion. Provide a dictionary mapping names to PDB content. Chain IDs should align with those specified in the Boltz YAML sequences.	{'ref_model_1.pdb': 'HEADER ...\nATOM ...\nEND', 'ref_model_2.pdb': 'HEADER ...\nATOM ...\nEND'}
seed	True	INT	Random seed. If multiple sequences are present, each sequence's seed is offset by its index so that every run is unique.	42
partial_diffusion_fraction	False	FLOAT	Fraction of the diffusion process to perform relative to the reference (0.0 = no change from reference, 1.0 = full diffusion).	0.25
save_trajectory	False	BOOLEAN	If true, saves intermediate diffusion trajectories for analysis.	False
fixed_chains	False	STRING	Comma-separated chain IDs to keep fixed (e.g., "A,B"). Must match chain IDs in the Boltz YAML sequences. Leave empty to allow all chains to move.	A,B
recycling_steps	False	INT	Number of recycling iterations.	3
sampling_steps	False	INT	Number of sampling steps during diffusion.	200
diffusion_samples	False	INT	Number of diffusion samples to generate per sequence.	1
max_parallel_samples	False	INT	Maximum number of samples to process in parallel.	5
step_scale	False	FLOAT	Diffusion step scale (temperature-like parameter).	1.638
output_format	False	ENUM[pdb, mmcif]	Format for output structures.	pdb
num_workers	False	INT	Number of worker processes to use (0 disables multiprocessing).	0
use_potentials	False	BOOLEAN	Use inference-time potentials. Recommended for partial diffusion.	True
write_full_pae	False	BOOLEAN	If true, writes the full PAE matrix.	False
write_full_pde	False	BOOLEAN	If true, writes the full PDE matrix.	False
max_msa_seqs	False	INT	Maximum number of MSA sequences to use.	8192
subsample_msa	False	BOOLEAN	If true, subsamples the MSA.	False
num_subsampled_msa	False	INT	Number of MSA sequences to retain when subsampling.	1024
affinity_mw_correction	False	BOOLEAN	Apply molecular weight correction when computing affinity (only used if affinity properties are present).	False
sampling_steps_affinity	False	INT	Sampling steps to use for affinity prediction (only used if affinity properties are present).	200
diffusion_samples_affinity	False	INT	Number of diffusion samples for affinity prediction (only used if affinity properties are present).	5
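The seed behaviour described in the table can be pictured with a short sketch. The exact offset scheme is not documented here, so the `seed + index` rule and the helper name `per_sequence_seeds` are assumptions made purely for illustration.

```python
def per_sequence_seeds(seed: int, num_sequences: int) -> list[int]:
    """Return one seed per sequence.

    Assumption: the offset is simply `seed + index`; the node's actual
    scheme may differ.
    """
    if num_sequences < 0:
        raise ValueError("num_sequences must be non-negative")
    return [seed + index for index in range(num_sequences)]


print(per_sequence_seeds(42, 3))  # [42, 43, 44]
```

Whatever the exact rule, the practical consequence is that rerunning with the same seed and the same sequence order should reproduce the same per-sequence seeds.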

Outputs

Field	Type	Description	Example
structures.pdb	PDB	Predicted structure files in the chosen format, returned as a dictionary mapping names to file content.	{'seqA_ranked_0.pdb': 'HEADER ...\nATOM ...\nEND', 'seqA_ranked_1.pdb': 'HEADER ...\nATOM ...\nEND'}
confidence.json	JSON	Confidence metrics for the predictions (e.g., per-model scores and matrices), keyed by sequence/model.	{'seqA_plddt.json': {'mean_plddt': 84.1}, 'seqA_pae.json': {'shape': [200, 200]}}
trajectory.pdb	PDB	Diffusion trajectory snapshots in PDB format. Populated only when save_trajectory is true; empty otherwise.	{'seqA_traj_step_050.pdb': 'ATOM ...', 'seqA_traj_step_200.pdb': 'ATOM ...'}
affinity.json	JSON	Affinity-related outputs if affinity properties were requested; otherwise returned as an empty object.	{'seqA_affinity.json': {'predicted_kd': 120.5, 'units': 'nM'}}
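Because the structure outputs are dictionaries mapping names to file content, downstream code usually needs to write them to disk. A minimal sketch follows; `write_structures` and `has_affinity` are hypothetical helpers, not part of the node.

```python
from pathlib import Path


def write_structures(structures: dict[str, str], out_dir: str) -> list[Path]:
    """Write each name -> content entry into out_dir and return the paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for name, content in sorted(structures.items()):
        # Keep only the final path component so a name cannot escape out_dir.
        path = target / Path(name).name
        path.write_text(content)
        written.append(path)
    return written


def has_affinity(affinity: dict) -> bool:
    """True when the affinity output is non-empty."""
    return bool(affinity)
```

The same helper works for the trajectory output, since it has the same name-to-content shape; an empty dictionary simply produces no files.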

Important Notes

Input requirements: boltz_yaml must contain at least one sequence, boltz_files must be a dictionary, and reference_structures must be a non-empty dictionary of valid PDB content.
Partial diffusion control: partial_diffusion_fraction regulates deviation from the reference; lower values keep structures closer to the supplied PDB(s).
Chain handling: fixed_chains must match chain IDs defined in the Boltz YAML sequences. Invalid IDs will cause an error.
Mode detection: The node automatically treats the job as simple (single chain) or multimer (multiple chains) based on chain IDs in the YAML.
Trajectories: trajectory outputs are only populated if save_trajectory is true; enabling this may increase runtime and memory usage.
Affinity outputs: Affinity parameters are only applied when affinity properties are present in the boltz_yaml; otherwise affinity inputs are ignored and the affinity output will be empty.
Performance: num_workers controls multiprocessing; increasing it may speed up processing but requires more system resources.
MSA controls: max_msa_seqs caps how many MSA sequences are used. If subsample_msa is true, the capped MSA is further reduced to num_subsampled_msa sequences.
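The input requirements, chain handling, and mode detection notes above can be checked before a run with a small pre-flight function. This is a sketch under stated assumptions: `chain_ids` and `preflight` are hypothetical helpers, the error messages are not the node's own, and the YAML layout (an `id` that is either a string or a list of strings) is assumed from the example in the Inputs table.

```python
def chain_ids(boltz_yaml: dict) -> list[str]:
    """Collect chain IDs from every entry under 'sequences'."""
    ids = []
    for entry in boltz_yaml.get("sequences", []):
        for spec in entry.values():  # e.g. {'protein': {'id': 'A', ...}}
            raw = spec.get("id")
            if raw is None:
                continue
            # Assumption: 'id' is a single ID or a list of IDs.
            ids.extend(raw if isinstance(raw, list) else [raw])
    return ids


def preflight(boltz_yaml, boltz_files, reference_structures, fixed_chains=""):
    """Check the documented requirements; return 'simple' or 'multimer'."""
    ids = chain_ids(boltz_yaml)
    if not ids:
        raise ValueError("boltz_yaml must contain at least one sequence")
    if not isinstance(boltz_files, dict):
        raise ValueError("boltz_files must be a dictionary")
    if not isinstance(reference_structures, dict) or not reference_structures:
        raise ValueError("reference_structures must be a non-empty dictionary")
    # Split the comma-separated string and drop surrounding whitespace.
    fixed = [c.strip() for c in fixed_chains.split(",") if c.strip()]
    missing = [c for c in fixed if c not in ids]
    if missing:
        raise ValueError(f"fixed chains not found in YAML: {missing}")
    return "simple" if len(ids) == 1 else "multimer"
```

Note that this sketch tolerates spaces around chain IDs; the node itself may be stricter, so passing a clean string such as A,B remains the safer choice.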

Troubleshooting

Error: 'Reference structures are required': Ensure reference_structures is a non-empty dictionary and contains valid PDB text.
Error: 'Fixed chains ... not found': Verify that fixed_chains matches chain IDs in the YAML sequences (e.g., A,B). Remove stray spaces and typos, and make sure the YAML defines those chain IDs.
No affinity output produced: Confirm boltz_yaml includes an affinity property. Without it, affinity-related settings are ignored and the affinity output will be empty.
Empty trajectory output: Set save_trajectory to true to generate trajectory snapshots.
Outputs missing expected models: Increase diffusion_samples or sampling_steps, and check that partial_diffusion_fraction and step_scale are appropriate for the desired degree of remodeling.
MSA too large or slow: Lower max_msa_seqs or enable subsample_msa and adjust num_subsampled_msa to reduce compute.
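To see how the three MSA settings interact when reducing compute, the following sketch caps an MSA and then optionally subsamples it. How the node actually selects sequences is not documented here; the uniform random sampling, the `seed` argument, and the helper name `select_msa` are all assumptions for illustration.

```python
import random


def select_msa(msa, max_msa_seqs, subsample_msa, num_subsampled_msa, seed=0):
    """Cap the MSA at max_msa_seqs, then optionally subsample it.

    Assumption: subsampling is uniform at random and preserves the
    original row order; the node's real selection strategy may differ.
    """
    capped = list(msa[:max_msa_seqs])
    if not subsample_msa or len(capped) <= num_subsampled_msa:
        return capped
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(capped)), num_subsampled_msa))
    return [capped[i] for i in keep]
```

Under these assumptions, lowering max_msa_seqs alone shrinks the MSA deterministically, while enabling subsample_msa trades some alignment depth for a larger reduction.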