Boltz-2 Virtual Screening
Performs high-throughput virtual screening of small molecules against a target protein using Boltz-2, predicting binding affinity and ranking top candidates for drug discovery.
Quick Start
- Prepare the target protein FASTA sequence.
- List ligand SMILES strings (one per line).
- (Recommended) Provide a protein MSA (A3M format) for better accuracy.
- Set screening parameters as needed.
- Run the node to obtain ranked screening results.
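As a minimal sketch of the first two steps, the two text inputs can be assembled with a few lines of Python. The helper name `build_inputs`, the chain label default, and the toy sequence are assumptions made for illustration; only the field names `protein_fasta` and `ligand_smiles_list` and the one-SMILES-per-line layout come from this page.

```python
def build_inputs(sequence: str, smiles: list[str], chain_id: str = "A") -> dict[str, str]:
    """Assemble the FASTA text and the newline-separated SMILES text.

    Hypothetical helper: not part of the node itself.
    """
    # Remove whitespace and line breaks from the raw sequence.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("protein sequence is empty")

    # Drop blank entries so every remaining line holds one SMILES string.
    ligands = [s.strip() for s in smiles if s.strip()]
    if not ligands:
        raise ValueError("no ligand SMILES provided")

    return {
        "protein_fasta": f">{chain_id}\n{seq}",
        "ligand_smiles_list": "\n".join(ligands),
    }


# Toy placeholder sequence and ligands, for illustration only.
inputs = build_inputs("MKT AYIAK", ["CCO", "C1=CC=CC=C1", ""])
print(inputs["protein_fasta"])
print(inputs["ligand_smiles_list"])
```

The resulting strings can be pasted into the corresponding node fields.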
Setup Guide
1. Prepare Inputs
- Obtain the target protein FASTA sequence.
- Collect ligand SMILES strings (one per line).
- (Optional) Prepare ligand names and a protein MSA (A3M).
2. Configure Parameters
- Adjust parameters such as top_k, affinity_threshold, binding_prob_threshold, and batch_size as needed.
- Enable fast_mode for speed, or disable it for higher accuracy.
3. Run and Review Results
- Execute the node.
- Review the output JSON for ranked hits and predicted affinities.
Basic Usage
Virtual Screening Workflow
- Screen a library of small molecules against a protein target.
- Filter hits by predicted affinity and binding probability.
- Retrieve top-ranked ligands for further analysis.
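The filter-then-rank logic of this workflow can be sketched as follows. This is an illustration under stated assumptions, not the node's implementation: the `Prediction` record, the `select_hits` helper, and the comparison directions (affinity at or below the cutoff, probability at or above the cutoff) are hypothetical, and the numbers are made-up placeholders rather than real predictions.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical per-ligand record; field names are assumptions."""
    name: str
    smiles: str
    affinity: float      # assumed log IC50: lower means stronger predicted binding
    binding_prob: float  # assumed probability in [0, 1]


def select_hits(preds, top_k=10, affinity_threshold=-5.0, binding_prob_threshold=0.7):
    """Keep ligands that pass both cutoffs, then return the top_k best."""
    passing = [
        p for p in preds
        if p.affinity <= affinity_threshold and p.binding_prob >= binding_prob_threshold
    ]
    # Rank by strongest predicted affinity first; break ties by higher probability.
    passing.sort(key=lambda p: (p.affinity, -p.binding_prob))
    return passing[:top_k]


# Made-up values for illustration only.
preds = [
    Prediction("lig_a", "CCO", -6.2, 0.91),
    Prediction("lig_b", "C1=CC=CC=C1", -4.1, 0.95),  # fails the affinity cutoff
    Prediction("lig_c", "CC(=O)O", -5.5, 0.40),      # fails the probability cutoff
    Prediction("lig_d", "CCN", -7.0, 0.80),
]
top = select_hits(preds, top_k=10)
print([p.name for p in top])
```

Loosening either threshold enlarges the candidate pool before top_k truncates it, which is why the two settings are usually tuned together.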
Configuration
| Field | Description | Type | Example |
| --- | --- | --- | --- |
| protein_fasta | Target protein FASTA sequence | FASTA | ">A\nMKT..." |
| ligand_smiles_list | SMILES strings (one per line) | STRING | "CCO\nC1=CC=CC=C1" |
| seed | Random seed for reproducibility | INT | 42 |

| Field | Description | Type | Example |
| --- | --- | --- | --- |
| top_k | Number of top hits to return | INT | 10 |
| affinity_threshold | Affinity cutoff (log IC50) | FLOAT | -5.0 |
| binding_prob_threshold | Binding probability cutoff | FLOAT | 0.7 |
| ligand_names | Ligand names (one per line, in the same order as the SMILES) | STRING | "Aspirin\nIbuprofen" |
| protein_msa | Protein MSA (A3M format, recommended) | A3M | {"A": "..."} |
| batch_size | Number of ligands processed per batch | INT | 5 |
| fast_mode | Use faster settings for screening | BOOLEAN | True |
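To make the batch_size setting concrete, the sketch below splits a ligand list into consecutive batches. How the node batches ligands internally is not described here, so the `batched` helper is a hypothetical illustration of the general idea only.

```python
from typing import Iterator


def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Three ligands in the one-per-line layout used by ligand_smiles_list.
ligands = "CCO\nC1=CC=CC=C1\nCC(=O)O".splitlines()
batches = list(batched(ligands, batch_size=2))
print(batches)
```

With this scheme the last batch may be smaller than batch_size when the library size is not an exact multiple.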
Outputs
| Field | Description | Example |
| --- | --- | --- |
| screening_results.json | Ranked screening results with affinities and scores | {"hits": [...]} |
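A minimal sketch for reading the output file is shown below. Only the top-level "hits" key is taken from the example above; the structure of each hit is not documented here, so the "rank" key in the stand-in file is purely hypothetical, as is the `load_hits` helper.

```python
import json
import tempfile
from pathlib import Path


def load_hits(path) -> list:
    """Load the results file and return its "hits" list (hypothetical helper)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict) or not isinstance(data.get("hits"), list):
        raise ValueError('expected a JSON object with a "hits" list')
    return data["hits"]


# Demo against a stand-in file; the per-hit "rank" key is an assumption.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "screening_results.json"
    demo_path.write_text(json.dumps({"hits": [{"rank": 1}, {"rank": 2}]}), encoding="utf-8")
    hits = load_hits(demo_path)

print(f"{len(hits)} hits loaded")
```

Inspect one entry of the real file first to learn which per-hit fields the node actually writes.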
Best Practices
- Always provide a high-quality protein FASTA sequence.
- Use realistic, valid SMILES strings for ligands.
- Supplying a protein MSA (A3M) significantly improves prediction quality.
Screening Parameters
- Adjust top_k and thresholds to balance hit rate and specificity.
- Use fast_mode for rapid initial screening; disable for higher accuracy.
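When choosing an affinity_threshold, it can help to translate a log IC50 value back into a concentration. The sketch below assumes a base-10 logarithm and takes the concentration unit as a parameter, because this page does not state whether IC50 is expressed in molar or micromolar. Both helper names are hypothetical.

```python
import math


def ic50_from_log(log_ic50: float) -> float:
    """Return IC50 in whatever unit the logarithm was taken in (assumes base 10)."""
    return 10.0 ** log_ic50


def pic50(log_ic50: float, unit: str = "M") -> float:
    """Convert log10(IC50) to pIC50 = -log10(IC50 in molar).

    unit is the concentration unit behind log_ic50: "M" (molar) or "uM" (micromolar).
    """
    offsets = {"M": 0.0, "uM": 6.0}  # log10 of the number of such units per molar
    return offsets[unit] - log_ic50


threshold = -5.0
print(ic50_from_log(threshold))     # concentration in the (unstated) unit
print(pic50(threshold, unit="M"))   # if the logarithm is of molar IC50
print(pic50(threshold, unit="uM"))  # if the logarithm is of micromolar IC50
```

Whichever unit applies, a lower log IC50 corresponds to a lower IC50 and therefore to stronger predicted binding.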
Troubleshooting
Common Issues
- Missing or invalid FASTA/SMILES: Ensure all required fields are filled and formatted correctly.
- No hits returned: Loosen affinity or probability thresholds, or check input quality.
- Mismatch in ligand names and SMILES: Provide the same number of names as SMILES entries.
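The issues above can be caught before running the node with a small pre-flight check. The sketch below is a hypothetical helper, not part of the node: it checks only the layout of the inputs (header line, non-empty fields, matching counts) and does not verify that a SMILES string is chemically valid, which would require a cheminformatics toolkit.

```python
def validate_inputs(protein_fasta: str, ligand_smiles_list: str, ligand_names: str = "") -> list[str]:
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []

    fasta_lines = [ln.strip() for ln in protein_fasta.splitlines() if ln.strip()]
    if not fasta_lines or not fasta_lines[0].startswith(">"):
        problems.append("protein_fasta must start with a '>' header line")
    elif len(fasta_lines) < 2:
        problems.append("protein_fasta has a header but no sequence")

    smiles = [ln.strip() for ln in ligand_smiles_list.splitlines() if ln.strip()]
    if not smiles:
        problems.append("ligand_smiles_list is empty")

    # Names are optional, but when given there must be one per SMILES entry.
    names = [ln.strip() for ln in ligand_names.splitlines() if ln.strip()]
    if names and len(names) != len(smiles):
        problems.append(f"{len(names)} ligand names given for {len(smiles)} SMILES entries")

    return problems


ok = validate_inputs(">A\nMKT", "CCO\nC1=CC=CC=C1", "Aspirin\nIbuprofen")
bad = validate_inputs("MKT", "", "Aspirin")
print(ok)
print(bad)
```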
Need Help?
- Contact support for further assistance.