Alphafold Initial Guess

Runs an AlphaFold-based initial-guess refinement from input sequences and their corresponding starting structures. It produces a predicted structure and summary scores (RMSD, PAE, pLDDT) for each input, with controls for residue numbering, the maximum amide bond distance, recycle count, and monomer enforcement.

Usage

Use this node when you have one or more protein sequences with corresponding starting PDB structures and want AlphaFold to produce a refined fold starting from them. Typical workflow: prepare a FASTA file and matching PDBs, run Alphafold Initial Guess to get refined structures and quality scores, then optionally post-process with relaxation or downstream analysis.

Inputs

| Field | Required | Type | Description | Example |
|---|---|---|---|---|
| fasta | True | FASTA | Protein sequences to process. Multiple sequences can be provided; record IDs must match the structure IDs in the PDB input. | >seq1 MSEQNNTEMTFQIQRIYTKDISFEAPNAPHVFQKDW >seq2 GHHHHHHMKTAYIAKQRQISFVKSHFSRQDILD |
| pdb | True | PDB | Starting structures corresponding to the sequences. If multiple, their IDs must match the FASTA record IDs. | { "seq1": "ATOM 1 N MET A 1 ...\nEND\n", "seq2": "ATOM 1 N GLY A 1 ...\nEND\n" } |
| maintain_res_numbering | True | BOOLEAN | If true, prevents renumbering of residues even when inputs contain issues. | false |
| max_amide_dist | True | FLOAT | Maximum allowed distance (Å) between an amide bond's carbon and nitrogen. | 3.0 |
| recycle | True | INT | Number of AlphaFold recycles to perform. | 3 |
| force_monomer | True | BOOLEAN | If true, forces prediction in monomer mode. | false |
| debug | True | BOOLEAN | If true, errors will halt execution; if false, the process attempts to continue even if some poses fail. | true |
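
The input fields above can be gathered into a single mapping before a run. The following is a minimal sketch only: how the node is actually invoked is not documented here, so `build_inputs` is a hypothetical helper, the keyword defaults simply reuse the example values from the table, and the sanity checks (positive distance, non-negative recycle count) are assumptions rather than rules enforced by the node.

```python
def build_inputs(fasta, pdb, maintain_res_numbering=False, max_amide_dist=3.0,
                 recycle=3, force_monomer=False, debug=True):
    """Assemble the node inputs as a plain dict, with basic sanity checks.

    Hypothetical helper: field names follow the Inputs table, defaults reuse
    the table's example values, and the checks below are assumptions.
    """
    if not fasta.lstrip().startswith(">"):
        raise ValueError("fasta must start with a '>' header line")
    if not pdb:
        raise ValueError("pdb must contain at least one structure")
    if max_amide_dist <= 0:
        raise ValueError("max_amide_dist must be a positive distance in angstroms")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(recycle, bool) or not isinstance(recycle, int) or recycle < 0:
        raise ValueError("recycle must be a non-negative integer")
    return {
        "fasta": fasta,
        "pdb": dict(pdb),  # mapping of structure ID -> PDB text
        "maintain_res_numbering": bool(maintain_res_numbering),
        "max_amide_dist": float(max_amide_dist),
        "recycle": recycle,
        "force_monomer": bool(force_monomer),
        "debug": bool(debug),
    }
```

Calling `build_inputs(">seq1\nMSEQ\n", {"seq1": "ATOM ...\nEND\n"}, recycle=5)` returns a dict with all seven fields populated, which can then be passed to whatever mechanism runs the node.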

Outputs

| Field | Type | Description | Example |
|---|---|---|---|
| folding.pdb | PDB | Predicted foldings for each input, returned as a mapping of IDs to PDB content. | { "seq1_rank_1": "ATOM ...\nEND\n", "seq2_rank_1": "ATOM ...\nEND\n" } |
| score.sc | STRING | Scores per design, including metrics such as RMSD, PAE, and pLDDT. Returned as a serialized string (e.g., JSON or tabular text) keyed by input IDs. | {"seq1": {"rmsd": 1.8, "pae": 7.2, "plddt": 85.6}, "seq2": {"rmsd": 2.3, "pae": 8.1, "plddt": 82.4}} |
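
As an illustration of consuming score.sc, the sketch below parses the string and lists designs that fall below a confidence cutoff. It assumes the JSON shape shown in the example above; the output may instead be tabular text, in which case a different parser is needed. The function name `low_confidence` and the thresholds (`min_plddt=80.0`, `max_pae=10.0`) are hypothetical choices for the example, not recommended values.

```python
import json


def low_confidence(score_text, min_plddt=80.0, max_pae=10.0):
    """Return the sorted IDs of designs with low pLDDT or high PAE.

    Assumes score_text is JSON of the form
    {"<id>": {"rmsd": ..., "pae": ..., "plddt": ...}, ...}.
    Thresholds are illustrative only.
    """
    scores = json.loads(score_text)
    flagged = []
    for design_id, metrics in scores.items():
        if metrics["plddt"] < min_plddt or metrics["pae"] > max_pae:
            flagged.append(design_id)
    return sorted(flagged)
```

Designs returned by this function are candidates for the steps listed under Troubleshooting, such as providing a better starting structure or raising recycle.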

Important Notes

  • ID matching: FASTA record IDs must exactly match the IDs of the input PDB structures; mismatches will cause errors.
  • Batch behavior: Multiple sequences are supported; runtime scales with the number of inputs.
  • Residue numbering: Enable maintain_res_numbering to preserve original numbering in challenging inputs.
  • Quality constraints: max_amide_dist sets the maximum allowed C-N distance (Å) of an amide bond; lower it for stricter geometry checks or raise it to tolerate lower-quality inputs.
  • Recycle trade-off: Higher recycle values can improve accuracy but increase computation time.
  • Monomer mode: Use force_monomer for single-chain predictions; disable when working with complex inputs that require multimeric context.
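
The ID-matching requirement can be checked before a run. The sketch below is one possible pre-check, not part of the node: `fasta_ids` and `id_mismatches` are hypothetical helpers, and the code assumes the record ID is the first whitespace-delimited token after `>` in each FASTA header, which may differ from how the node itself derives IDs.

```python
def fasta_ids(fasta_text):
    """Return record IDs: the first whitespace-delimited token after '>'."""
    ids = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            parts = line[1:].split()
            ids.append(parts[0] if parts else "")
    return ids


def id_mismatches(fasta_text, pdb_by_id):
    """Compare FASTA record IDs with the keys of the PDB mapping."""
    seq_ids = set(fasta_ids(fasta_text))
    pdb_ids = set(pdb_by_id)
    return {
        "missing_pdb": sorted(seq_ids - pdb_ids),    # sequences with no structure
        "missing_fasta": sorted(pdb_ids - seq_ids),  # structures with no sequence
    }
```

If both lists in the result are empty, every sequence has a structure with the same ID and vice versa.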

Troubleshooting

  • Mismatched IDs between FASTA and PDB: Ensure the FASTA headers and PDB mapping keys are identical (e.g., seq1 in both).
  • Geometry or amide bond errors: Increase max_amide_dist or enable maintain_res_numbering to avoid renumbering and reduce failures.
  • Long runtimes or timeouts: Reduce recycle, process fewer sequences at a time, or verify service availability.
  • Unexpected renumbering: Set maintain_res_numbering to true to preserve original residue indices.
  • Low confidence (low pLDDT/high PAE): Provide higher-quality starting structures, adjust recycle upward, or check that the input sequence matches the structure.
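
When geometry or amide bond errors appear, it can help to see which backbone junctions in a starting structure exceed max_amide_dist. The sketch below is an independent pre-check under stated assumptions, not the node's internal logic: it reads standard fixed-column ATOM records, ignores insertion codes and alternate locations, and measures the distance from each residue's backbone C to the backbone N of the next residue in the same chain. The helper names are hypothetical.

```python
import math


def backbone_atoms(pdb_text):
    """Collect backbone N and C coordinates per (chain, residue number)."""
    residues = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()  # atom name, PDB columns 13-16
        if name not in ("N", "C"):
            continue
        key = (line[21], int(line[22:26]))  # chain ID, residue number
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        residues.setdefault(key, {})[name] = xyz
    return residues


def long_amide_bonds(pdb_text, max_amide_dist=3.0):
    """List (residue, next_residue, distance) where C-N exceeds the limit."""
    residues = backbone_atoms(pdb_text)
    keys = sorted(residues)
    flagged = []
    for prev, nxt in zip(keys, keys[1:]):
        if prev[0] != nxt[0]:
            continue  # do not measure across chains
        c = residues[prev].get("C")
        n = residues[nxt].get("N")
        if c is None or n is None:
            continue  # incomplete backbone; nothing to measure
        dist = math.dist(c, n)
        if dist > max_amide_dist:
            flagged.append((prev, nxt, round(dist, 2)))
    return flagged
```

Junctions reported here point to chain breaks or missing residues in the input, which is useful context when deciding whether to raise max_amide_dist or repair the starting structure instead.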