Alphafold Initial Guess

Runs an AlphaFold-based refinement that uses the provided sequences and their corresponding starting structures as an initial guess. For each input it returns a predicted folding and summary scores (RMSD, PAE, pLDDT). Options control residue numbering, the amide bond distance limit, the recycle count, and monomer enforcement.

Usage

Use this node when you have one or more protein sequences, each with a corresponding initial PDB structure, and want AlphaFold to refine that structure from an initial guess. Typical workflow: prepare a FASTA input and matching PDBs, run Alphafold Initial Guess to get refined foldings and quality scores, then optionally post-process with relaxation or downstream analysis.

Inputs

Field	Required	Type	Description	Example
fasta	True	FASTA	Protein sequences to process. Multiple sequences can be provided; record IDs must match the structure IDs in the PDB input.	>seq1 MSEQNNTEMTFQIQRIYTKDISFEAPNAPHVFQKDW >seq2 GHHHHHHMKTAYIAKQRQISFVKSHFSRQDILD
pdb	True	PDB	Starting structures corresponding to the sequences. If multiple, their IDs must match the FASTA record IDs.	{ "seq1": "ATOM 1 N MET A 1 ...\nEND\n", "seq2": "ATOM 1 N GLY A 1 ...\nEND\n" }
maintain_res_numbering	True	BOOLEAN	If true, prevents renumbering of residues even when inputs contain issues.	false
max_amide_dist	True	FLOAT	Maximum allowed distance (Å) between an amide bond's carbon and nitrogen.	3.0
recycle	True	INT	Number of AlphaFold recycles to perform.	3
force_monomer	True	BOOLEAN	If true, forces prediction in monomer mode.	false
debug	True	BOOLEAN	If true, errors will halt execution; if false, the process attempts to continue even if some poses fail.	true
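
Because the fasta and pdb inputs must share IDs, a quick pre-check before running the node can save a failed run. The following is a minimal sketch, not part of the node: fasta_ids and check_id_match are hypothetical helper names, and it assumes the record ID is the first whitespace-delimited token of each FASTA header and that the PDB input is a mapping of ID to PDB text, as in the example column above.

```python
def fasta_ids(fasta_text):
    """Return record IDs (first token after '>') in order of appearance."""
    ids = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            header = line[1:].strip()
            # Assumption: the ID is the first whitespace-delimited token.
            ids.append(header.split()[0] if header else "")
    return ids


def check_id_match(fasta_text, pdb_map):
    """Compare FASTA record IDs with the keys of an ID -> PDB-text mapping."""
    f_ids = set(fasta_ids(fasta_text))
    p_ids = set(pdb_map)
    return {
        "missing_pdb": sorted(f_ids - p_ids),    # in FASTA, no structure
        "missing_fasta": sorted(p_ids - f_ids),  # structure, no sequence
    }


fasta = (
    ">seq1\nMSEQNNTEMTFQIQRIYTKDISFEAPNAPHVFQKDW\n"
    ">seq2\nGHHHHHHMKTAYIAKQRQISFVKSHFSRQDILD\n"
)
# Deliberately mismatched mapping: seq2 is missing and seq3 is extra.
pdbs = {"seq1": "ATOM ...\nEND\n", "seq3": "ATOM ...\nEND\n"}

report = check_id_match(fasta, pdbs)
print(report)  # {'missing_pdb': ['seq2'], 'missing_fasta': ['seq3']}
```

Both lists being empty means every sequence has a structure and vice versa. How the node itself parses headers is not documented here, so treat the first-token rule as an assumption.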

Outputs

Field	Type	Description	Example
folding.pdb	PDB	Predicted foldings for each input, returned as a mapping of IDs to PDB content.	{ "seq1_rank_1": "ATOM ...\nEND\n", "seq2_rank_1": "ATOM ...\nEND\n" }
score.sc	STRING	Scores per design, including metrics such as RMSD, PAE, and pLDDT. Returned as a serialized string (e.g., JSON or tabular text) keyed by input IDs.	{"seq1": {"rmsd": 1.8, "pae": 7.2, "plddt": 85.6}, "seq2": {"rmsd": 2.3, "pae": 8.1, "plddt": 82.4}}
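
Since score.sc arrives as a serialized string, downstream steps usually need to parse it before filtering. The sketch below assumes the JSON form shown in the example column (the output may instead be tabular text, which this does not handle) and the metric keys rmsd, pae, and plddt from that example. parse_scores and passing_ids are hypothetical helpers, and the thresholds are illustrative placeholders, not recommended cutoffs.

```python
import json


def parse_scores(score_text):
    """Parse score.sc, assuming JSON of the form {id: {metric: value}}."""
    scores = json.loads(score_text)
    if not isinstance(scores, dict):
        raise ValueError("expected a JSON object keyed by input ID")
    return scores


def passing_ids(scores, min_plddt=80.0, max_pae=10.0, max_rmsd=2.0):
    """Return IDs whose metrics satisfy all (illustrative) thresholds."""
    keep = []
    for key, metrics in scores.items():
        if (
            metrics["plddt"] >= min_plddt
            and metrics["pae"] <= max_pae
            and metrics["rmsd"] <= max_rmsd
        ):
            keep.append(key)
    return keep


score_sc = (
    '{"seq1": {"rmsd": 1.8, "pae": 7.2, "plddt": 85.6}, '
    '"seq2": {"rmsd": 2.3, "pae": 8.1, "plddt": 82.4}}'
)
scores = parse_scores(score_sc)
selected = passing_ids(scores)
print(selected)  # ['seq1']  (seq2 fails the RMSD threshold)
```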

Important Notes

ID matching: FASTA record IDs must exactly match the IDs of the input PDB structures; mismatches will cause errors.
Batch behavior: Multiple sequences are supported; runtime scales with the number of inputs.
Residue numbering: Enable maintain_res_numbering to preserve the original residue numbering, even when the inputs contain issues.
Quality constraints: max_amide_dist sets the largest carbon-nitrogen distance (Å) allowed for an amide bond; lower it for stricter geometry or raise it to tolerate lower-quality inputs.
Recycle trade-off: Higher recycle values can improve accuracy but increase computation time.
Monomer mode: Use force_monomer for single-chain predictions; disable it when working with complexes that require multimeric context.
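
To see which backbone gaps in a starting structure would exceed a given max_amide_dist, the C-to-N distance between consecutive residues can be measured directly. This is a rough pre-check under stated assumptions, not the node's internal logic: it reads standard fixed-column ATOM records, ignores alternate locations and insertion codes, and treats residues as consecutive by file order within a chain. backbone_atoms, long_amide_bonds, and atom_line are hypothetical helpers.

```python
import math


def backbone_atoms(pdb_text):
    """Collect backbone N and C coordinates per residue, in file order."""
    residues = {}  # (chain, resseq) -> {"N": xyz, "C": xyz}
    order = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        if name not in ("N", "C"):
            continue
        key = (line[21], int(line[22:26]))
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if key not in residues:
            residues[key] = {}
            order.append(key)
        residues[key][name] = xyz
    return [(key, residues[key]) for key in order]


def long_amide_bonds(pdb_text, max_amide_dist=3.0):
    """Flag consecutive same-chain residues whose C(i)-N(i+1) distance
    exceeds the cutoff. Returns (residue_i, residue_j, distance) tuples."""
    atoms = backbone_atoms(pdb_text)
    flagged = []
    for (k1, a1), (k2, a2) in zip(atoms, atoms[1:]):
        if k1[0] != k2[0] or "C" not in a1 or "N" not in a2:
            continue
        d = math.dist(a1["C"], a2["N"])
        if d > max_amide_dist:
            flagged.append((k1, k2, round(d, 2)))
    return flagged


def atom_line(serial, name, res, chain, resseq, x, y, z):
    """Build a fixed-column ATOM record for the toy example below."""
    return (
        f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


# Toy structure along the x axis with a chain break between residues 2 and 3.
pdb = "\n".join([
    atom_line(1, "N", "MET", "A", 1, 0.0, 0.0, 0.0),
    atom_line(2, "C", "MET", "A", 1, 2.4, 0.0, 0.0),
    atom_line(3, "N", "SER", "A", 2, 3.73, 0.0, 0.0),  # 1.33 Å from previous C
    atom_line(4, "C", "SER", "A", 2, 6.1, 0.0, 0.0),
    atom_line(5, "N", "GLU", "A", 3, 10.6, 0.0, 0.0),  # 4.5 Å from previous C
    atom_line(6, "C", "GLU", "A", 3, 13.0, 0.0, 0.0),
])
print(long_amide_bonds(pdb))  # [(('A', 2), ('A', 3), 4.5)]
```

An empty result suggests the backbone is contiguous at the chosen cutoff; flagged pairs point to gaps worth inspecting before deciding whether to raise max_amide_dist or repair the input.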

Troubleshooting

Mismatched IDs between FASTA and PDB: Ensure the FASTA headers and PDB mapping keys are identical (e.g., seq1 in both).
Geometry or amide bond errors: Increase max_amide_dist or enable maintain_res_numbering to avoid renumbering and reduce failures.
Long runtimes or timeouts: Reduce recycle, process fewer sequences at a time, or verify service availability.
Unexpected renumbering: Set maintain_res_numbering to true to preserve original residue indices.
Low confidence (low pLDDT/high PAE): Provide higher-quality starting structures, increase recycle, or check that the input sequence matches the structure.
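
Checking that a sequence matches its structure can be done by reading the residue names off the CA atoms and comparing them with the FASTA sequence. The sketch below is one way to do that, assuming standard fixed-column ATOM records and the 20 canonical amino acids (anything else becomes X). Chains are concatenated in file order, which may not match how multi-chain sequences are expected to be written for this node. sequence_from_pdb, first_mismatch, and ca_line are hypothetical helpers.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb(pdb_text):
    """One-letter sequence from CA atoms; unknown residues become 'X'."""
    seq = []
    seen = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], line[22:27])  # chain, residue number + insertion code
            if key in seen:  # skip alternate locations of the same residue
                continue
            seen.add(key)
            seq.append(THREE_TO_ONE.get(line[17:20].strip(), "X"))
    return "".join(seq)


def first_mismatch(fasta_seq, pdb_seq):
    """Return the 0-based index of the first difference, or None if equal."""
    for i, (a, b) in enumerate(zip(fasta_seq, pdb_seq)):
        if a != b:
            return i
    if len(fasta_seq) != len(pdb_seq):
        return min(len(fasta_seq), len(pdb_seq))
    return None


def ca_line(serial, res, chain, resseq):
    """Build a fixed-column CA ATOM record for the toy example below."""
    return (
        f"ATOM  {serial:5d}  CA  {res:>3s} {chain}{resseq:4d}    "
        + "   0.000" * 3
    )


pdb = "\n".join([
    ca_line(1, "MET", "A", 1),
    ca_line(2, "SER", "A", 2),
    ca_line(3, "GLU", "A", 3),
])
pdb_seq = sequence_from_pdb(pdb)
print(pdb_seq)                          # MSE
print(first_mismatch("MSE", pdb_seq))   # None
print(first_mismatch("MTE", pdb_seq))   # 1
print(first_mismatch("MSEQ", pdb_seq))  # 3  (FASTA is longer than the structure)
```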