OpenMM PDB Fixer

This node cleans and repairs input PDB structure data so it can be used reliably in downstream OpenMM solvation, minimization, and simulation steps. It adds a controlled preprocessing step for real-world structures with missing atoms, non-standard residues, alternate locations, and protonation-state issues, while giving explicit control over how ligands, cofactors, metals, and waters are handled.

Usage

Use this node early in a molecular simulation workflow whenever your starting structure comes from an experimental PDB file or any source that may contain incomplete or non-standard topology. It is especially useful for multi-chain proteins, antibodies, structures with chain breaks, missing terminal atoms, alternate conformations, or modified residues such as MSE that often cause downstream force-field setup to fail.

Typical workflow position: connect a structure-producing node such as a PDB loader into OpenMM PDB Fixer, then send its fixed.pdb output into OpenMM Solvate. From there, the usual sequence is OpenMM ForceField Config → OpenMM Solvate → OpenMM Energy Minimize → OpenMM Simulate. If your structure contains ligands or other non-water heterogens that must be preserved, pair this node with OpenMM Ligand Parameters and set heterogen_mode to preserve so downstream OpenMM nodes can parameterize those residues correctly.

Best practices: keep heterogen_mode at its default, error_if_present, unless you are certain you want to discard heterogens or you already have ligand parameterization prepared. Use replace_nonstandard_residues=true for most crystal structures to improve compatibility with standard protein force fields. Leave keep_water=false in most cases: explicit solvent is usually added later by OpenMM Solvate, and crystallographic waters from the input are rarely needed in general-purpose simulation pipelines.

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| pdb | True | PDB | One or more PDB structures to repair. The node expects a PDB collection keyed by structure ID, and processes each entry individually. | `{"7KVE":"HEADER ANTIBODY COMPLEX ...\nATOM ...\nEND"}` |
| pH | False | FLOAT | pH value used when assigning protonation states during hydrogen addition. Valid range is 0.0 to 14.0; physiological workflows typically use 7.4. | `7.4` |
| heterogen_mode | False | STRING | Controls treatment of non-water HETATM residues. `error_if_present` stops execution if ligands, cofactors, or metals are present; `remove_nonwater` strips non-water heterogens; `preserve` keeps them for downstream ligand-aware workflows. | `preserve` |
| keep_water | False | BOOLEAN | Whether to retain water molecules from the input structure. This setting only matters when `heterogen_mode` is `remove_nonwater`. In most explicit-solvent workflows, input waters are not needed because solvent is added later. | `false` |
| replace_nonstandard_residues | False | BOOLEAN | Whether to convert non-standard residues to their canonical equivalents where possible, such as MSE to MET. This improves compatibility with standard force fields. | `true` |
| timeout | False | INT | Maximum wait time in seconds for repairing each input PDB. Valid range is 60 to 3600 seconds; applied per structure in a batch. | `600` |
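The documented ranges and allowed values can be checked before a workflow is submitted. The following is a minimal sketch, not the node's own validation code: `FixerSettings` is a hypothetical container, and every default except `heterogen_mode` is borrowed from the Example column rather than from a confirmed node default.

```python
from dataclasses import dataclass

# The three modes named in the heterogen_mode description.
HETEROGEN_MODES = ("error_if_present", "remove_nonwater", "preserve")


@dataclass(frozen=True)
class FixerSettings:
    """Hypothetical settings object mirroring the documented input fields."""

    pH: float = 7.4                            # assumed default (Example column)
    heterogen_mode: str = "error_if_present"   # documented default
    keep_water: bool = False                   # assumed default (Example column)
    replace_nonstandard_residues: bool = True  # assumed default (Example column)
    timeout: int = 600                         # assumed default (Example column)

    def __post_init__(self):
        # Reject values outside the ranges stated in the Inputs table.
        if not 0.0 <= self.pH <= 14.0:
            raise ValueError(f"pH must be within 0.0 to 14.0, got {self.pH}")
        if self.heterogen_mode not in HETEROGEN_MODES:
            raise ValueError(f"unknown heterogen_mode: {self.heterogen_mode!r}")
        if not 60 <= self.timeout <= 3600:
            raise ValueError(f"timeout must be within 60 to 3600 s, got {self.timeout}")
```

Constructing `FixerSettings(heterogen_mode="preserve", timeout=1200)` succeeds, while `FixerSettings(timeout=30)` raises `ValueError` before anything is sent to the node.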

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| fixed.pdb | PDB | Repaired PDB structure collection ready for downstream solvation, minimization, or simulation. Each output entry contains cleaned coordinates and topology text for the corresponding input structure. | `{"7KVE":"HEADER REPAIRED STRUCTURE ...\nATOM ...\nEND"}` |
| statistics | DATAFRAME | Tabular summary of repaired structures with one row per input PDB. Includes at least `pdb_id` and `num_atoms` so downstream steps can verify successful repair and structure size. | `[{"pdb_id":"7KVE","num_atoms":18432}]` |
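A downstream step can use the statistics output to confirm that every input structure was repaired. The sketch below assumes the statistics table has been converted to a list of row dictionaries, as in the Example column, and that fixed.pdb is available as a dictionary keyed by structure ID; `check_repair` is a hypothetical helper, not part of the node.

```python
def check_repair(input_ids, fixed_pdb, statistics):
    """Return a list of problems; an empty list means the outputs look consistent."""
    problems = []
    # Index the statistics rows by structure ID for direct lookup.
    rows = {row["pdb_id"]: row for row in statistics}
    for pdb_id in input_ids:
        if pdb_id not in fixed_pdb:
            problems.append(f"{pdb_id}: missing from fixed.pdb")
        row = rows.get(pdb_id)
        if row is None:
            problems.append(f"{pdb_id}: no statistics row")
        elif row["num_atoms"] <= 0:
            problems.append(f"{pdb_id}: num_atoms is {row['num_atoms']}")
    return problems
```

Running this check before solvation gives an early, readable failure instead of a force-field error several nodes later.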

Important Notes

  • Behavior: The default heterogen_mode is error_if_present, which intentionally prevents silent removal of ligands, cofactors, or metals. This protects simulation workflows from accidentally discarding chemically important molecules.
  • Integration: If you choose preserve, you should typically use OpenMM Ligand Parameters before OpenMM Solvate, OpenMM Energy Minimize, or OpenMM Simulate so preserved ligand residues can be parameterized downstream.
  • Workflow: keep_water only affects the remove_nonwater path. It does not change behavior when heterogens are preserved or when execution stops on heterogen detection.
  • Performance: Repair is generally fast for protein-only structures, but batch inputs are processed per PDB and the timeout applies to each one individually, so large batches can take proportionally longer.
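The interaction between heterogen_mode and keep_water described in these notes can be illustrated on raw PDB text. This is a rough sketch under stated assumptions and does not reproduce the node's implementation: `apply_heterogen_mode` is a hypothetical function, waters are recognized only by the residue names HOH and WAT, and waters are assumed to pass through unchanged in the `preserve` and `error_if_present` modes, which the documentation does not specify.

```python
WATER_NAMES = {"HOH", "WAT"}  # assumption: residue names treated as water
MODES = {"error_if_present", "remove_nonwater", "preserve"}


def apply_heterogen_mode(pdb_text, heterogen_mode="error_if_present", keep_water=False):
    """Filter HETATM records from PDB text according to the selected mode."""
    if heterogen_mode not in MODES:
        raise ValueError(f"unknown heterogen_mode: {heterogen_mode!r}")
    kept = []
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            kept.append(line)  # ATOM and other records are never filtered here
            continue
        res_name = line[17:20].strip()  # residue name, PDB columns 18-20
        is_water = res_name in WATER_NAMES
        if heterogen_mode == "error_if_present" and not is_water:
            raise ValueError(f"non-water heterogen present: {res_name}")
        if heterogen_mode == "remove_nonwater":
            # keep_water is consulted only on this path.
            if is_water and keep_water:
                kept.append(line)
            continue
        kept.append(line)  # preserve, or water under error_if_present (assumed)
    return "\n".join(kept)
```

Note that `keep_water` is read in exactly one branch, which mirrors the statement that it has no effect outside the remove_nonwater path.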

Troubleshooting

  • Force-field setup fails later with template or atom-mismatch errors: Run the structure through this node before solvation or minimization, and keep replace_nonstandard_residues enabled so modified residues are normalized where possible.
  • Node stops because heterogens are present: This is expected when heterogen_mode is error_if_present. Switch to preserve if the ligand or cofactor must remain, then provide matching residue definitions using OpenMM Ligand Parameters; or use remove_nonwater for protein-only simulations.
  • Unexpected waters remain or disappear: Check the combination of heterogen_mode and keep_water. keep_water only applies when using remove_nonwater, and most workflows should still add a fresh solvent box later with OpenMM Solvate.
  • Repair times out on a large or unusual structure: Increase the timeout value, especially for very large complexes or problematic input files processed in batches.
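When diagnosing the heterogen-related cases above, it can help to inspect the input before choosing settings. The sketch below scans PDB text for HETATM records and reports which residue names appear; `scan_heterogens` is a hypothetical helper, the water names HOH and WAT are an assumption, and whether this node counts MSE as a heterogen under error_if_present is not stated in this documentation.

```python
from collections import Counter

WATER_NAMES = {"HOH", "WAT"}  # assumption: residue names treated as water


def scan_heterogens(pdb_text):
    """Summarize HETATM records in PDB text by residue name."""
    water_atoms = 0
    nonwater = Counter()
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()  # residue name, PDB columns 18-20
        if res_name in WATER_NAMES:
            water_atoms += 1
        else:
            nonwater[res_name] += 1
    return {
        "water_atoms": water_atoms,
        "nonwater_atoms": dict(nonwater),  # atom count per residue name
        "has_mse": "MSE" in nonwater,      # selenomethionine, convertible to MET
    }
```

A non-empty `nonwater_atoms` suggests deciding between preserve and remove_nonwater up front, and `has_mse` being true is a hint to keep replace_nonstandard_residues enabled.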