Colabfold Search¶

Runs a multiple sequence alignment (MSA) search with the ColabFold pipeline for one or more protein sequences supplied in FASTA format, and returns the alignments in A3M format keyed by sequence ID. This node is currently disabled and raises a 'NotImplementedError' if executed.

Usage¶

Use this node to obtain A3M alignments via ColabFold for downstream protein structure prediction steps (e.g., AlphaFold). Provide one or more sequences in FASTA format; each record is searched separately and returned as an A3M entry keyed by its FASTA header. Note: this node is currently disabled. Use the MSA Search node instead to retrieve MSAs.

Inputs¶

Field	Required	Type	Description	Example
fasta	True	FASTA	Protein sequence(s) in FASTA format to run the ColabFold MSA search on. Supports multiple records; each header becomes a key in the output.	>seq1\nMKTAYIAKQRQISFVKSHFSRQDILD\n>seq2\nGHHHHHHENLYFQGAMASMTGGQQMGRGS
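
The input requirements (valid FASTA, unique headers, amino acid characters only) can be checked before a sequence ever reaches the node. The following is a minimal sketch, not the node's own validation: the function name `parse_fasta`, the accepted residue alphabet, and the choice to use the full header text as the key are all assumptions made for illustration.

```python
# Assumed alphabet: the 20 standard amino acids plus the ambiguity/rare codes
# B, Z, X, U and O. The alphabet the node actually accepts is not documented here.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBZXUO")


def parse_fasta(text):
    """Parse FASTA text into a dict of header -> sequence, validating as it goes."""
    records = {}
    header = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if not header:
                raise ValueError("FASTA record has an empty header")
            if header in records:
                raise ValueError(f"duplicate FASTA header: {header!r}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # Sequences may span several lines; join them and normalise case.
            records[header] += line.upper()

    if not records:
        raise ValueError("no FASTA records found")
    for name, sequence in records.items():
        if not sequence:
            raise ValueError(f"record {name!r} has no sequence")
        invalid = sorted(set(sequence) - VALID_RESIDUES)
        if invalid:
            raise ValueError(f"record {name!r} contains invalid characters: {invalid}")
    return records


fasta = ">seq1\nMKTAYIAKQRQISFVKSHFSRQDILD\n>seq2\nGHHHHHHENLYFQGAMASMTGGQQMGRGS\n"
records = parse_fasta(fasta)
print(list(records))  # ['seq1', 'seq2']
```

Raising on the first problem keeps the sketch short; a workflow that wants to report every issue at once could collect the messages in a list instead.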

Outputs¶

Field	Type	Description	Example
msa.a3m	A3M	A mapping of sequence IDs to their ColabFold MSA results in A3M format.	{"seq1": ">seq1\nMKTAYIAKQRQISFVKSHFSRQDILD\n>hit1\nMKTA-IAKQRQISFVKSHFSRqdILD", "seq2": ">seq2\nGHHHHHHENLYFQGAMASMTGGQQMGRGS\n>hitA\nGHHHHHHENLYFQGAMAS-TGGQQMGRGS"}
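
To work with the output in a downstream step, each A3M string in the mapping can be split into its individual entries. This is a sketch under the assumption that the output arrives as a plain dict of strings shaped like the example above; `parse_a3m` is a hypothetical helper name, not part of the node.

```python
def parse_a3m(a3m_text):
    """Split an A3M string into a list of (header, aligned_sequence) pairs."""
    entries = []
    header, chunks = None, []
    for raw_line in a3m_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and '#' lines, which some A3M files use as a preamble.
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            if header is not None:
                entries.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        entries.append((header, "".join(chunks)))
    return entries


# Output mapping copied from the example in the table above.
msa = {
    "seq1": ">seq1\nMKTAYIAKQRQISFVKSHFSRQDILD\n>hit1\nMKTA-IAKQRQISFVKSHFSRqdILD",
    "seq2": ">seq2\nGHHHHHHENLYFQGAMASMTGGQQMGRGS\n>hitA\nGHHHHHHENLYFQGAMAS-TGGQQMGRGS",
}

for seq_id, a3m_text in msa.items():
    entries = parse_a3m(a3m_text)
    query_header, query_sequence = entries[0]  # the first entry is the query
    print(seq_id, "query:", query_header, "hits:", len(entries) - 1)
```

In A3M, lowercase letters mark insertions relative to the query and `-` marks deletions, so the aligned strings should be kept exactly as returned rather than upper-cased.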

Important Notes¶

Disabled: This node is currently disabled and will raise a 'NotImplementedError' if executed.
Input format: Provide valid FASTA with unique headers; multiple sequences are supported and processed individually.
Output mapping: Output keys mirror the input FASTA headers; each value is an A3M string.
Performance: When enabled, MSA searches can be slow. Each record is searched individually, so runtime grows with both sequence length and the number of sequences.
Alternative: Use the 'MSA Search' node as a functional alternative for obtaining A3M alignments.
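
If the node is invoked from a script rather than a graphical workflow, the disabled state can be handled by catching the error and falling back to the alternative. The sketch below is purely illustrative: `colabfold_search` and `msa_search` are hypothetical stand-ins for however the two nodes are called in a given environment, and only the 'NotImplementedError' behavior comes from the notes above.

```python
def search_with_fallback(fasta, primary, fallback):
    """Call primary(fasta); if it is disabled, call fallback(fasta) instead."""
    try:
        return primary(fasta)
    except NotImplementedError:
        # The primary search is disabled, so use the alternative search.
        return fallback(fasta)


# Hypothetical stand-ins so the sketch runs on its own.
def colabfold_search(fasta):
    raise NotImplementedError("Colabfold Search is disabled")


def msa_search(fasta):
    return {"seq1": ">seq1\nMKTAYIAKQRQISFVKSHFSRQDILD"}


result = search_with_fallback(
    ">seq1\nMKTAYIAKQRQISFVKSHFSRQDILD", colabfold_search, msa_search
)
print(sorted(result))  # ['seq1']
```

Only 'NotImplementedError' is caught, so genuine failures in either search still surface to the caller.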

Troubleshooting¶

NotImplementedError raised: The node is disabled. Switch to the 'MSA Search' node or contact support to request ColabFold search enablement.
Empty or invalid output: Ensure the input FASTA is valid (proper headers, only amino acid characters) and that headers are unique.
Large runtimes or timeouts: Shorten the sequences or submit fewer of them per run; if an alternative node offers a reduced or toy dataset, consider using it.
Mismatched IDs downstream: Keep FASTA headers consistent across workflow steps; output A3M keys match the input FASTA headers.
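
Because output keys are expected to match the input FASTA headers, an ID mismatch can be spotted by comparing the two sets directly. The following is a minimal sketch, assuming the output is a dict keyed by the full header text; `compare_ids` is a hypothetical helper name.

```python
def compare_ids(fasta_text, msa):
    """Return (missing, unexpected): headers without an MSA, and MSA keys without a header."""
    headers = [
        line.strip()[1:].strip()
        for line in fasta_text.splitlines()
        if line.strip().startswith(">")
    ]
    header_set = set(headers)
    missing = [h for h in headers if h not in msa]
    unexpected = [k for k in msa if k not in header_set]
    return missing, unexpected


fasta = ">seq1\nMKTAYIAKQRQISFVKSHFSRQDILD\n>seq2\nGHHHHHHENLYFQGAMASMTGGQQMGRGS\n"
msa = {"seq1": ">seq1\nMKTAYIAKQRQISFVKSHFSRQDILD", "seq_2": ">seq_2\nGHHHHHHENLYFQGAMASMTGGQQMGRGS"}

missing, unexpected = compare_ids(fasta, msa)
print("missing:", missing)        # missing: ['seq2']
print("unexpected:", unexpected)  # unexpected: ['seq_2']
```

Running a check like this right after the search makes a renamed or truncated header visible before it causes a lookup failure further down the workflow.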