
Alphafold

Predicts protein 3D structures from multiple sequence alignments (A3M) using an Alphafold service. Supports monomer model presets and optional template search, with deterministic seeding and per-sequence batching. Multimer and relaxation are currently not supported.

Usage

Use this node after generating MSAs. Typical workflow: 1) Run MSA Search to produce A3M alignments for your sequences, 2) Feed the resulting A3M dict into Alphafold to obtain ranked PDB structures. Choose TEST mode for quick sanity checks, and PROD for full runs.

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| a3m | True | A3M | Multiple sequence alignments in A3M format, provided as a dict mapping sequence IDs to A3M content. Keys represent sequence IDs; values are the corresponding A3M strings. | {"seq1": ">seq1\nMKT...\n>homolog1\nMKV...\n", "seq2": ">seq2\nGAA..."} |
| search_templates | True | BOOLEAN | Whether to perform template search prior to prediction. Can improve accuracy but increases runtime. | false |
| model_preset | True | STRING | Alphafold model configuration to use. Supported: monomer, monomer_ptm, monomer_casp14. The value "multimer" is listed but not supported and will error. | monomer_ptm |
| models_to_relax | True | STRING | When to run relaxation (NONE, BEST, ALL). Currently not supported; must be NONE. | NONE |
| enable_gpu_relax | True | BOOLEAN | Whether relaxation runs on GPU or CPU. Has no effect currently since relaxation is not supported. | true |
| skip_models | True | STRING | Comma-separated list of Alphafold model indices (1–5) to skip during prediction. At least one model must remain. Example: "2,5" to skip models 2 and 5. | 2,4 |
| seed | True | INT | Base random seed. For multiple sequences, each subsequent sequence uses an incremented seed (seed + index). | 42 |
| mode | True | STRING | Run mode: MOCK (returns predefined results), PROD (calls the service), TEST (uses minimal parameters for quick checks). In TEST mode, template search is disabled and only model 1 is used. | PROD |
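
The skip_models rules can be checked before submitting a run. The following is a minimal sketch, not the node's own implementation: the helper name `parse_skip_models` and its error messages are hypothetical, and it assumes that an empty string means "skip nothing".

```python
def parse_skip_models(skip_models: str, n_models: int = 5) -> list[int]:
    """Return the model indices that would still run (hypothetical helper)."""
    skipped = set()
    for token in skip_models.split(","):
        token = token.strip()
        if not token:
            # Assumption: empty entries (or an empty string) skip nothing.
            continue
        index = int(token)  # raises ValueError for non-integer entries
        if not 1 <= index <= n_models:
            raise ValueError(f"model index {index} is outside [1, {n_models}]")
        skipped.add(index)
    remaining = [i for i in range(1, n_models + 1) if i not in skipped]
    if not remaining:
        raise ValueError(f"cannot skip all {n_models} models")
    return remaining
```

For example, `parse_skip_models("2,4")` leaves models 1, 3, and 5 to run, while `"1,2,3,4,5"` or `"6"` raises a `ValueError`.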

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| folding.pdb | PDB | Ranked predicted structures for each input sequence as a dict. Keys are composed of the input sequence ID and model/rank identifiers; values are PDB file contents. | {"seq1_model_1_rank_1": "ATOM ...", "seq1_model_1_rank_2": "ATOM ..."} |
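
Downstream steps often need the structures grouped per input sequence. The sketch below assumes that every key follows the pattern seen in the example, `<sequence ID>_model_<n>_rank_<r>`; that pattern is inferred from the example only, and `group_by_sequence` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumed key layout, inferred from the example "seq1_model_1_rank_1".
KEY_PATTERN = re.compile(r"^(?P<seq_id>.+)_model_(?P<model>\d+)_rank_(?P<rank>\d+)$")


def group_by_sequence(pdbs: dict[str, str]) -> dict[str, list[tuple[int, int, str]]]:
    """Group PDB strings by sequence ID as (rank, model, pdb_text), sorted by rank."""
    grouped = defaultdict(list)
    for key, pdb_text in pdbs.items():
        match = KEY_PATTERN.match(key)
        if match is None:
            raise ValueError(f"unexpected output key: {key!r}")
        grouped[match["seq_id"]].append(
            (int(match["rank"]), int(match["model"]), pdb_text)
        )
    return {seq_id: sorted(entries) for seq_id, entries in grouped.items()}
```

With this layout, `group_by_sequence(outputs)["seq1"][0]` is the entry with the lowest rank number for `seq1`.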

Important Notes

  • Model limitations: Multimer model is not supported and will raise an error if selected.
  • Relaxation unsupported: models_to_relax must be NONE; BEST and ALL are not currently available.
  • skip_models validation: Values must be integers 1–5 and cannot skip all five models; at least one model must run.
  • TEST mode behavior: Automatically disables template search and runs only model 1 (equivalent to skipping models 2–5).
  • Input consistency: The A3M dict keys (sequence IDs) must match the extracted FASTA IDs; mismatches will error.
  • Seeding per sequence: For multiple sequences, the seed used is incremented per sequence (seed + index), aiding reproducibility.
  • Runtime considerations: Longer sequences and enabling template search increase runtime and compute requirements.
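
The seeding and TEST-mode notes above can be illustrated with a short sketch. Both function names are hypothetical, and two details are assumptions rather than documented behavior: that the index starts at 0 for the first sequence, and that sequences are processed in the order of the A3M dict keys.

```python
def per_sequence_seeds(sequence_ids: list[str], seed: int) -> dict[str, int]:
    """Map each sequence ID to seed + index (index assumed to start at 0)."""
    return {seq_id: seed + index for index, seq_id in enumerate(sequence_ids)}


def effective_settings(mode: str, search_templates: bool, skip_models: str) -> tuple[bool, str]:
    """Return the (search_templates, skip_models) pair a run would effectively use."""
    if mode == "TEST":
        # TEST mode disables template search and runs only model 1,
        # which is equivalent to skipping models 2-5.
        return False, "2,3,4,5"
    return search_templates, skip_models
```

Under these assumptions, a base seed of 42 with sequences `seq1` and `seq2` gives seeds 42 and 43, so rerunning with the same inputs in the same order reproduces the same per-sequence seeds.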

Troubleshooting

  • Error: Multimer model is not supported: Set model_preset to a supported monomer variant (monomer, monomer_ptm, or monomer_casp14).
  • Error: Relaxation is not supported: Ensure models_to_relax is set to NONE.
  • Error: Expected models indices in skip_models to be in range [1, 5]: Provide a comma-separated list with values from 1 to 5 only.
  • Error: Cannot skip all 5 models: Remove at least one index from skip_models so at least one model runs.
  • Error: Expected FASTA IDs to match features IDs: Ensure A3M keys correctly correspond to the sequence IDs; do not rename keys between steps.
  • Unexpectedly fast results: MOCK mode returns predefined results without calling the service. Set mode to PROD for real predictions.
  • Long queue times or timeouts: Reduce sequence length, disable template search, or use TEST mode to validate the pipeline quickly before full runs.
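
Several of the errors above can be caught before a run is submitted. The following pre-flight check is a sketch only: `preflight` and `first_header_id` are hypothetical helpers, the messages are not the service's own, and it assumes that the "extracted FASTA ID" is the first whitespace-delimited token of the first `>` header line in each A3M string.

```python
from typing import Optional

SUPPORTED_PRESETS = {"monomer", "monomer_ptm", "monomer_casp14"}
MODES = {"MOCK", "PROD", "TEST"}


def first_header_id(a3m_text: str) -> Optional[str]:
    """Return the ID on the first '>' header line, or None if there is no header."""
    for line in a3m_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            return tokens[0] if tokens else ""
    return None


def preflight(a3m: dict[str, str], model_preset: str, models_to_relax: str, mode: str) -> list[str]:
    """Collect human-readable problems; an empty list means no issue was found."""
    problems = []
    if model_preset not in SUPPORTED_PRESETS:
        problems.append(f"unsupported model_preset: {model_preset!r}")
    if models_to_relax != "NONE":
        problems.append("models_to_relax must be NONE")
    if mode not in MODES:
        problems.append(f"unknown mode: {mode!r}")
    for seq_id, a3m_text in a3m.items():
        # Assumption: the dict key must equal the ID of the first A3M header.
        if first_header_id(a3m_text) != seq_id:
            problems.append(f"A3M key {seq_id!r} does not match its first header ID")
    return problems
```

Running `preflight` on the inputs and fixing anything it reports should avoid the preset, relaxation, and ID-mismatch errors listed above, though the service remains the final authority on what it accepts.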