PDB Chain Extractor¶

Extracts specified chains from one or more PDB structures. It retains relevant header/metadata lines and all ATOM/HETATM records for the selected chains, producing a new PDB that contains only those chains. Chain selection is case-insensitive and whitespace-tolerant.

Usage¶

Use this node when you need to isolate specific protein chains from PDB structures for focused analysis or downstream tasks (e.g., per-chain sequence extraction, modeling, or visualization). Connect a PDB dictionary input and specify one or more chain IDs (e.g., A, A,B). The output preserves the original PDB IDs while filtering contents to the requested chains.

Inputs¶

Field	Required	Type	Description	Example
pdb	True	PDB	Dictionary of PDB structures to filter, mapping {pdb_id: pdb_content}. Each value is a PDB text string.	{"1abc": "HEADER ...\nATOM ... A ...\nATOM ... B ...\nEND"}
chains	True	STRING	Comma-separated list of chain IDs to extract. Single-character, alphanumeric chain IDs only. Spaces are ignored and case-insensitive.	A, B

Outputs¶

Field	Type	Description	Example
filtered_pdb	PDB	Dictionary of PDB structures filtered to contain only the specified chains. Keys are the original PDB IDs; values are PDB strings that include the chosen chains and end with an END record.	{"1abc": "HEADER ...\nATOM ... A ...\nATOM ... A ...\nEND"}

Important Notes¶

Chain IDs: Must be single alphanumeric characters (e.g., A, B, 1). Input is case-insensitive and ignores spaces.
Multiple PDBs: Input may contain a batch of PDBs; each will be filtered independently while preserving its original ID.
Metadata lines: Common header and annotation records (e.g., HEADER, TITLE, REMARK, HELIX, SHEET) are preserved. ATOM/HETATM lines are kept only for selected chains.
End of file: The output PDB string includes a terminal END line.
Validation: If no atoms are found for the requested chains in a PDB, the node raises an error listing available chains detected in that PDB.
Empty entries: Empty PDB contents are skipped; if all are empty or invalid, the node raises an error.

Troubleshooting¶

Error: Chains parameter cannot be empty: Provide at least one chain ID, e.g., "A" or "A,B".
Error: Invalid chain ID 'X': Ensure each chain ID is a single alphanumeric character with no extra symbols.
Error: No atoms found for chains [...]: Verify the requested chains exist in the PDB. The error message includes available chains detected; adjust your selection accordingly.
Output is empty or missing expected chains: Confirm the chain list formatting (comma-separated), check for typos, and ensure the original PDB actually contains those chains.
Batch input issues: If some PDBs are empty or malformed, they will be skipped. Ensure at least one valid PDB remains; otherwise, the node will raise an error.