Genie 2

Generates protein backbone structures using the Genie 2 model. Supports two modes: unconditional generation across a range of sequence lengths, and motif scaffolding conditioned on an input PDB and contig layout. Returns the generated structure and, for scaffolding, the extracted motif structure.

Usage

Use this node to design proteins either from scratch (unconditional mode) or by scaffolding specified motifs within a provided structure (conditional mode). For unconditional generation, leave input_pdb empty and optionally set a range of lengths with min_length, max_length, and length_step. For motif scaffolding, provide input_pdb and a contigs string defining fixed motif segments and flexible regions; length_step is ignored in this mode. Choose the checkpoint (epoch 40 for unconditional, epoch 30 for scaffolding) and adjust scale and batch_size as needed.

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| checkpoint | True | CHOICE | Model checkpoint to use. 40-epoch is recommended for unconditional generation; 30-epoch for motif scaffolding. | base/epoch.40 |
| scale | True | FLOAT | Sampling noise scale for the generator. | 0.6 |
| batch_size | True | INT | Number of samples processed per batch. Reduce if you run out of GPU memory. | 1 |
| min_length | True | INT | Minimum sequence length to consider. Used for both modes; must be <= max_length. | 100 |
| max_length | True | INT | Maximum sequence length to consider. Used for both modes; must be >= min_length. | 100 |
| length_step | True | INT | Step size between sampled sequence lengths in unconditional mode. Ignored in motif scaffolding mode. | 5 |
| contigs | True | STRING | Defines scaffold layout for motif scaffolding with fixed motif segments and flexible regions. Use residue ranges like A367-372 and flexible gaps like 10-10. Multiple motifs can be grouped with labels in parentheses (A, B, etc.). Chain breaks are not supported; all segments are sequentially connected. | A367-372(B)/10-10/A342-348(A)/10-20/B342-348(A)/10-10/B367-372(B) |
| seed | True | INT | Base seed for deterministic sampling. | 42 |
| input_pdb | False | PDB | Input structure for motif scaffolding. If provided, node performs conditional generation using contigs. Leave empty for unconditional generation. | PDB data or reference |
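
To make the contigs layout concrete, here is a minimal sketch that splits a contigs string into motif and gap segments and adds up the shortest and longest backbone it could describe. The grammar is inferred only from the examples on this page (segments separated by "/", a single-letter chain ID, an optional group label in parentheses); the parser used by the node itself may accept or reject other forms. `Segment`, `parse_contigs`, and `length_bounds` are hypothetical names used for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed grammar, inferred from the documented examples only.
MOTIF_RE = re.compile(r"([A-Za-z])(\d+)-(\d+)(?:\(([A-Za-z0-9]+)\))?")
GAP_RE = re.compile(r"(\d+)-(\d+)")


@dataclass
class Segment:
    kind: str             # "motif" or "gap"
    chain: Optional[str]  # chain ID for motifs, None for gaps
    start: int            # first residue (motif) or minimum length (gap)
    end: int              # last residue (motif) or maximum length (gap)
    group: Optional[str]  # optional motif group label, e.g. "A"


def parse_contigs(contigs: str) -> List[Segment]:
    """Split a contigs string on '/' and classify each segment."""
    segments = []
    for token in contigs.strip().split("/"):
        token = token.strip()
        motif = MOTIF_RE.fullmatch(token)
        gap = GAP_RE.fullmatch(token)
        if motif:
            chain, start, end, group = motif.groups()
            seg = Segment("motif", chain, int(start), int(end), group)
        elif gap:
            seg = Segment("gap", None, int(gap.group(1)), int(gap.group(2)), None)
        else:
            raise ValueError(f"Unrecognized contig segment: {token!r}")
        if seg.start > seg.end:
            raise ValueError(f"Range is reversed in segment: {token!r}")
        segments.append(seg)
    return segments


def length_bounds(segments: List[Segment]) -> Tuple[int, int]:
    """Return (shortest, longest) total length implied by the segments."""
    shortest = longest = 0
    for seg in segments:
        if seg.kind == "motif":
            n = seg.end - seg.start + 1  # residue ranges are inclusive
            shortest += n
            longest += n
        else:
            shortest += seg.start
            longest += seg.end
    return shortest, longest


example = "A367-372(B)/10-10/A342-348(A)/10-20/B342-348(A)/10-10/B367-372(B)"
segments = parse_contigs(example)
print(length_bounds(segments))  # (56, 66)
```

For the example string from the table, the four motif segments contribute 26 residues and the three gaps contribute between 30 and 40, so the layout spans 56 to 66 residues under these assumptions.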

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| generation.pdb | PDB | Generated protein structure. In unconditional mode, a design for each sampled length; in scaffolding mode, the scaffolded design. | Generated PDB object |
| motif.pdb | PDB | Motif structure extracted from the generation (primarily relevant in scaffolding mode). | PDB object of the designed motif |

Important Notes

  • For unconditional generation, do not provide input_pdb; contigs will be ignored.
  • For motif scaffolding, both input_pdb and a valid contigs string are required; length_step is not used.
  • Recommended checkpoints: base/epoch.40 for unconditional; base/epoch.30 for scaffolding.
  • min_length must be less than or equal to max_length. For unconditional mode, lengths are sampled from max_length down to min_length in steps of length_step.
  • Contigs format: use fixed motif ranges (e.g., A10-25), flexible gaps (e.g., 5-15), and optional motif groups in parentheses to distinguish segments (e.g., A367-372(B)). Chain breaks are not supported.
  • Batch size impacts GPU memory; lower it if you encounter out-of-memory issues.
  • Generated outputs include metadata such as the seed and configuration for traceability.
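
The descending length sweep described in the notes above can be pictured with a short sketch. It assumes the sweep starts exactly at max_length and that min_length is included when the step lands on it; `candidate_lengths` is a hypothetical helper for illustration, not part of the node.

```python
from typing import List


def candidate_lengths(min_length: int, max_length: int, length_step: int) -> List[int]:
    """List lengths from max_length down to min_length in steps of length_step."""
    if min_length > max_length:
        raise ValueError("min_length must be <= max_length")
    if length_step <= 0:
        raise ValueError("length_step must be > 0")
    # range() excludes its stop value, so stop one below min_length.
    return list(range(max_length, min_length - 1, -length_step))


print(candidate_lengths(100, 120, 5))  # [120, 115, 110, 105, 100]
print(candidate_lengths(100, 112, 5))  # [112, 107, 102]
print(candidate_lengths(100, 100, 5))  # [100]
```

Under these assumptions, a larger length_step produces fewer lengths, and min_length itself is only reached when the span between the two bounds is a multiple of the step.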

Troubleshooting

  • Error: Found 0 lengths between min_length and max_length with step length_step. Ensure that max_length >= min_length and that length_step is greater than 0 and small enough to produce at least one length.
  • Error: Expected min_length <= max_length. Set min_length to a value less than or equal to max_length.
  • Error: For motif scaffolding task, contigs must be passed. Provide a valid contigs string when input_pdb is provided.
  • Contigs appear to have no effect: contigs are ignored when input_pdb is not provided. Provide input_pdb to run motif scaffolding, or leave contigs empty in unconditional mode to avoid confusion.
  • Results vary run-to-run: Set a fixed seed and keep other parameters constant to get deterministic outputs.
  • Out of memory: Reduce batch_size or the target lengths; try a smaller max_length or increase length_step so fewer lengths are sampled in unconditional mode.
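
Several of the errors above can be caught before a run is launched. The following is a minimal sketch of such a check; `preflight` and its messages are hypothetical and only mirror the field names and rules documented on this page, so the node's own validation may differ in wording and in order.

```python
from typing import List, Optional


def preflight(
    min_length: int,
    max_length: int,
    length_step: int,
    contigs: str = "",
    input_pdb: Optional[str] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # Mode is inferred the way the page describes it: input_pdb present means scaffolding.
    scaffolding = bool(input_pdb)

    if min_length > max_length:
        problems.append("min_length must be <= max_length")

    if scaffolding:
        if not contigs.strip():
            problems.append("contigs is required when input_pdb is provided")
    else:
        # length_step only matters in unconditional mode.
        if length_step <= 0:
            problems.append("length_step must be > 0 in unconditional mode")
        if contigs.strip():
            problems.append("contigs is ignored when input_pdb is not provided")

    return problems


# Unconditional run with the example values from the Inputs table.
print(preflight(min_length=100, max_length=100, length_step=5))  # []
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.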