PDB Chain Extractor
Extracts one or more chains from each input PDB structure and returns new PDBs containing only those chains. Header and metadata records are preserved, only the ATOM/HETATM records for the requested chains are kept, and an END record is appended.

Usage
Use this node when you need to subset PDB structures by chain, for example to isolate a ligand-bound chain or to prepare inputs for downstream analysis that should include only selected chains. Provide a dictionary of PDBs (e.g., from a loader or combiner node) and a comma-separated list of chain IDs to keep.
Inputs
| Field | Required | Type | Description | Example |
|---|---|---|---|---|
| pdb | True | PDB | Dictionary of PDB structures to process, mapping identifiers to PDB text content (e.g., {"structure1": "...PDB text..."}). Each entry will be filtered to include only the specified chains. | {'1ABC': 'HEADER ...\\nATOM ... A ...\\nATOM ... B ...\\nEND'} |
| chains | True | STRING | Comma-separated list of chain IDs to extract. Accepts formats such as "A", "A,B", or "A, B"; whitespace around commas is ignored. Each chain ID must be a single alphanumeric character and is matched case-insensitively. | A,B |
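
The parsing rules for the `chains` input can be sketched as follows. This is a minimal illustration, not the node's actual implementation: the function name `parse_chains`, the exact error wording, the ASCII-only check, and the removal of duplicate IDs are all assumptions made for the example.

```python
def parse_chains(chains: str) -> list[str]:
    """Parse a comma-separated chain string into uppercase chain IDs.

    Hypothetical helper: mirrors the documented rules (single alphanumeric
    character per ID, whitespace ignored, case-insensitive).
    """
    if not chains or not chains.strip():
        raise ValueError("Chains parameter cannot be empty")

    parsed: list[str] = []
    for raw in chains.split(","):
        chain_id = raw.strip().upper()
        # Assumption: only single ASCII letters or digits are accepted.
        if len(chain_id) != 1 or not (chain_id.isascii() and chain_id.isalnum()):
            raise ValueError(f"Invalid chain ID '{raw.strip()}'")
        # Assumption: repeated IDs are collapsed, keeping first-seen order.
        if chain_id not in parsed:
            parsed.append(chain_id)
    return parsed
```

Under these assumptions, `parse_chains("a, B")` yields `["A", "B"]`, while an input such as `"AB"` or an empty string raises a `ValueError`.
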
Outputs
| Field | Type | Description | Example |
|---|---|---|---|
| filtered_pdb | PDB | Dictionary of filtered PDB structures containing only the specified chains for each input entry. Keys mirror the input identifiers. | {'1ABC': 'HEADER ...\\nATOM ... A ...\\nATOM ... A ...\\nEND'} |
Important Notes
- Chain ID validation: Each chain ID must be a single alphanumeric character; invalid entries will cause an error.
- Case-insensitive matching: Chain IDs are matched case-insensitively (converted to uppercase internally).
- Metadata retention: Header and metadata records (e.g., HEADER, TITLE, REMARK, SEQRES, CRYST1) and MODEL lines are preserved; only ATOM/HETATM lines are filtered by chain.
- END record: An END record is appended to the output PDB content.
- Strict per-entry validation: If any input PDB contains none of the requested chains (no ATOM/HETATM lines match), the node raises an error and lists available chains in that PDB.
- Empty inputs skipped: PDB entries with empty content are skipped with a warning; if all are empty or invalid, the node errors.
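
The behavior described in these notes can be approximated with the sketch below. It is an illustration under stated assumptions, not the node's source: the function names `filter_pdb_text` and `extract_chains` and the error messages are hypothetical, the chain ID is read from column 22 of each ATOM/HETATM record (the position defined by the PDB format), chain IDs in the file are uppercased before comparison, and any existing END line is dropped so that exactly one END record is appended. Other record types (for example TER or CONECT) are simply passed through here, which may differ from the real node.

```python
import warnings


def filter_pdb_text(pdb_text: str, chains: set[str]) -> str:
    """Keep non-coordinate records plus ATOM/HETATM lines of the given chains."""
    kept: list[str] = []
    available: set[str] = set()
    matched = False

    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            # Column 22 (index 21) holds the chain ID in the PDB format.
            chain_id = line[21:22].strip().upper()
            if chain_id:
                available.add(chain_id)
            if chain_id in chains:
                kept.append(line)
                matched = True
        elif record == "END":
            # Assumption: drop the original END; one is appended below.
            continue
        else:
            # HEADER, TITLE, REMARK, SEQRES, CRYST1, MODEL, etc. pass through.
            kept.append(line)

    if not matched:
        raise ValueError(
            f"No atoms found for chains {sorted(chains)}. "
            f"Available chains: {sorted(available)}"
        )
    return "\n".join(kept + ["END"])


def extract_chains(pdb: dict[str, str], chains: set[str]) -> dict[str, str]:
    """Filter every entry of a PDB dictionary, skipping empty entries."""
    filtered: dict[str, str] = {}
    for name, text in pdb.items():
        if not text or not text.strip():
            warnings.warn(f"Skipping empty PDB entry '{name}'")
            continue
        filtered[name] = filter_pdb_text(text, chains)
    if not filtered:
        raise ValueError("No valid PDB structures could be processed")
    return filtered
```

The `chains` argument is expected to contain uppercase IDs already. Output keys mirror the input keys, except for entries that were skipped as empty.
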
Troubleshooting
- Chains parameter cannot be empty: Provide at least one chain ID, e.g., "A" or "A,B".
- Invalid chain ID '...': Ensure each chain ID is a single alphanumeric character (e.g., "A", "1").
- No atoms found for chains [...]: The specified chains do not exist in the PDB entry. Use the 'Available chains' listed in the error message and update the 'chains' input accordingly.
- No valid PDB structures could be processed: All inputs were empty or invalid. Verify the 'pdb' input dictionary contains non-empty PDB text.
- Unexpected output structure: The node returns a dictionary of filtered PDBs keyed by the original identifiers. Ensure downstream nodes expect a PDB dictionary.
- Whitespace in chains: Spaces are allowed and ignored around commas; ensure the list uses commas to separate chain IDs.
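
When a "No atoms found" error is likely, it can help to check which chains a structure contains before running the node. The snippet below is a small stand-alone sketch for that purpose; `available_chains` is a hypothetical helper, not part of the node, and it assumes standard fixed-column PDB text in which the chain ID sits in column 22 of ATOM/HETATM records.

```python
def available_chains(pdb_text: str) -> list[str]:
    """Return the sorted chain IDs found in ATOM/HETATM records."""
    found: set[str] = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Column 22 (index 21) is the chain identifier.
            chain_id = line[21:22].strip()
            if chain_id:
                found.add(chain_id)
    return sorted(found)
```

Comparing the returned list against the intended `chains` value shows whether a requested ID is missing from a given entry.
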