PDB To CIF
Converts one or more protein structures from PDB format to CIF format. Accepts a dictionary of PDB contents and returns a dictionary of converted CIF contents with the same keys. The node uses BioPython for parsing/writing and includes a best-effort fix to map atom records to entity IDs for protein chains.

Usage
Use this node when downstream tools or analyses require CIF (mmCIF) format instead of PDB. Typical workflow: load or assemble a batch of PDB structures, convert them to CIF with this node, then feed the CIF outputs into subsequent structure processing, validation, or visualization nodes that expect CIF.
Inputs
| Field | Required | Type | Description | Example |
|---|---|---|---|---|
| pdb | True | PDB | Dictionary of PDB structures to convert. Keys are structure IDs (names), values are PDB file contents as strings. | {'example_value': {'my_structure': 'ATOM 1 N MET A 1 11.104 13.207 2.103 1.00 20.00 N ...', 'another_structure': 'ATOM 1 N ALA B 1 5.234 -2.151 8.930 1.00 15.00 N ...'}} |
Outputs
| Field | Type | Description | Example |
|---|---|---|---|
| structure.cif | CIF | Dictionary of CIF structures converted from the input PDBs. Keys mirror the input keys; values are CIF contents as strings. | {'example_value': {'my_structure': 'data_template\n#\nloop_\n_atom_site.group_PDB _atom_site.id _atom_site.type_symbol ...', 'another_structure': 'data_template\n#\nloop_\n_atom_site.group_PDB _atom_site.id _atom_site.type_symbol ...'}} |
Important Notes
- Input format: The input must be a non-empty dictionary mapping structure names to PDB content strings.
- Batch-friendly: Multiple PDBs can be converted in one run; empty entries are skipped.
- Entity mapping: The node attempts to improve atom-to-entity mappings for protein chains in the resulting CIF; this is heuristic and may not adjust all cases.
- Dependencies: Requires BioPython (including PDB and mmCIF modules) to be available in the environment.
- Error handling: If none of the inputs can be converted, the node raises an error.
- Output structure: The output dictionary preserves the original input keys for easy tracking.
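
The batch behavior described in these notes can be sketched as follows. This is an illustration under stated assumptions, not the node's actual source: `convert_batch` and `convert_one` are hypothetical names, the exception types are guesses, and the "no structures converted" message is invented for the example. Only the "non-empty dictionary" message comes from this page.

```python
from typing import Callable, Dict


def convert_batch(
    pdb: Dict[str, str],
    convert_one: Callable[[str, str], str],
) -> Dict[str, str]:
    """Convert every usable entry; result keys mirror the input keys."""
    # Input format: must be a non-empty dict of name -> PDB content.
    if not isinstance(pdb, dict) or not pdb:
        raise ValueError("PDB input must be a non-empty dictionary")

    converted: Dict[str, str] = {}
    for name, text in pdb.items():
        # Batch-friendly: skip empty or whitespace-only entries.
        if not isinstance(text, str) or not text.strip():
            continue
        try:
            converted[name] = convert_one(name, text)
        except Exception:
            # Assumption: an unparsable entry is skipped (a real node
            # would presumably log the failure before moving on).
            continue

    # Error handling: fail only when nothing at all could be converted.
    if not converted:
        raise RuntimeError("No PDB structures could be converted")  # hypothetical message
    return converted
```

Passing the converter in as `convert_one` keeps the sketch independent of BioPython, so the skipping and error rules can be exercised with a stand-in function.
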
Troubleshooting
- Error: 'PDB input must be a non-empty dictionary': Ensure you pass a dict like {"name": "..."} whose values are PDB content strings, and that the strings are not empty or whitespace-only.
- Conversion failed for a specific structure: Verify the PDB content is valid and parsable by BioPython; remove nonstandard lines or fix formatting issues.
- Missing or incorrect entity IDs in CIF: The node includes a best-effort fix for protein chains; if issues persist, consider using specialized CIF post-processing tools.
- Environment errors (missing modules): Install BioPython and its PDB/mmCIF I/O components in the runtime environment.
- Partial conversion (some entries missing): Empty or invalid PDB strings are skipped; check logs and ensure each dict value contains valid PDB content.
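
When some entries go missing from the output, a quick pre-check of the input dictionary can narrow down the cause before running the node. The helper below is a hypothetical diagnostic, not part of the node; it only flags values that are not strings, are blank, or contain no `ATOM`/`HETATM` records, and it does not guarantee that BioPython will parse the rest.

```python
from typing import Dict


def find_suspect_entries(pdb: Dict[str, object]) -> Dict[str, str]:
    """Map each suspicious structure name to a short reason."""
    problems: Dict[str, str] = {}
    for name, text in pdb.items():
        if not isinstance(text, str):
            problems[name] = "value is not a string"
        elif not text.strip():
            problems[name] = "empty or whitespace-only"
        elif not any(
            line.startswith(("ATOM", "HETATM")) for line in text.splitlines()
        ):
            # Coordinate records start with ATOM or HETATM in PDB files.
            problems[name] = "no ATOM/HETATM records"
    return problems
```

An empty result means every entry passed these basic checks; any name that appears in the result is a likely candidate for having been skipped.
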