Python arguments are equivalent to long-option arguments (--arg), unless otherwise specified. Flags are True/False arguments in Python. The manual for any gget tool can be called from the command line using the -h / --help flag.
gget alphafold 🪢
Predict the 3D structure of a protein from its amino acid sequence using a simplified version of DeepMind’s AlphaFold2 originally released and benchmarked for AlphaFold Colab.
Returns: Predicted structure (PDB) and alignment error (json).
Before using gget alphafold for the first time:
- Install the third-party dependency openmm v7.5.1 (or v7.7.0 for Python >= 3.10) by running the following command from the command line:
conda install -qy conda==4.13.0 && conda install -qy -c conda-forge openmm=7.5.1
(replace with openmm=7.7.0 for Python >= 3.10)
Recommendation: Follow with conda update -qy conda to update conda to the latest version afterwards.
- Run gget setup alphafold / gget.setup("alphafold") once (also see gget setup). Running gget setup alphafold / gget.setup("alphafold") will download and install the latest version of AlphaFold2 hosted on the AlphaFold GitHub Repo. You can rerun this command any time to update the software after a new AlphaFold release.
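The openmm version depends on the Python version, so it can help to derive the install command programmatically. The following is a minimal sketch that only builds the command string given in the instructions above; the helper names openmm_version_for and openmm_install_command are hypothetical and not part of gget.

```python
import sys


def openmm_version_for(python_version=None):
    """Return the openmm version the instructions name for a Python version.

    Hypothetical helper: 7.7.0 for Python >= 3.10, otherwise 7.5.1.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    major_minor = tuple(python_version[:2])
    return "7.7.0" if major_minor >= (3, 10) else "7.5.1"


def openmm_install_command(python_version=None):
    """Build the conda command from the instructions with the matching openmm version."""
    version = openmm_version_for(python_version)
    return (
        "conda install -qy conda==4.13.0 && "
        f"conda install -qy -c conda-forge openmm={version}"
    )


# Print the command for the interpreter running this script
print(openmm_install_command())
```

The sketch does not run conda itself; copy the printed command into a terminal, then run gget setup alphafold as described.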
Positional argument
sequence
Amino acid sequence (str), or list of sequences (gget alphafold will automatically use the multimer model if multiple sequences are passed), or path to FASTA file.
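To make the three accepted input forms concrete, here is a rough sketch of how a string, a list of strings, or a FASTA path could be reduced to a plain list of sequences before deciding whether more than one sequence was passed. This is not gget's internal implementation: read_sequences and multiple_sequences_passed are hypothetical helpers, and treating each FASTA record as one sequence is an assumption.

```python
import os
import tempfile


def read_sequences(sequence):
    """Normalize a string, a list of strings, or a FASTA path to a list of sequences.

    Hypothetical helper for illustration; not part of gget.
    """
    if isinstance(sequence, (list, tuple)):
        return [s.strip() for s in sequence]
    if os.path.isfile(sequence):
        records, current = [], []
        with open(sequence) as handle:
            for line in handle:
                line = line.strip()
                if line.startswith(">"):
                    # A header line closes the previous record, if any
                    if current:
                        records.append("".join(current))
                    current = []
                elif line:
                    current.append(line)
        if current:
            records.append("".join(current))
        return records
    # Anything else is treated as a single raw amino acid sequence
    return [sequence.strip()]


def multiple_sequences_passed(sequence):
    """True when the input holds more than one sequence."""
    return len(read_sequences(sequence)) > 1


# Demo: write a two-record FASTA file to a temporary folder and read it back
with tempfile.TemporaryDirectory() as tmp:
    fasta_path = os.path.join(tmp, "example.fa")
    with open(fasta_path, "w") as handle:
        handle.write(">chain_1\nMAAHKGAE\nHHHKAAEH\n>chain_2\nHEQAAKHH\n")
    from_fasta = read_sequences(fasta_path)

print(from_fasta)
print(multiple_sequences_passed("MAAHKGAEHHHK"))
```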
Optional arguments
-mr
--multimer_recycles
The multimer model will continue recycling until the predictions stop changing, up to the limit set here. Default: 3.
For higher accuracy, at the potential cost of longer inference times, set this to 20.
-o
--out
Path to folder to save prediction results in (str). Default: "./[date_time]_gget_alphafold_prediction".
Flags
-mfm
--multimer_for_monomer
Use multimer model for a monomer.
-r
--relax
AMBER relax the best model.
-q
--quiet
Command-line only. Prevents progress information from being displayed.
Python: Use verbose=False to prevent progress information from being displayed.
plot
Python only. plot=True provides an interactive, 3D graphical overview of the predicted structure and alignment quality using py3Dmol and matplotlib (default: True).
show_sidechains
Python only. show_sidechains=True includes side chains in the plot (default: True).
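Because Python arguments mirror the long-option names, the mapping between the two interfaces can be written down explicitly. The sketch below builds a command-line argument list from Python-style keyword arguments using only the options and flags documented above; alphafold_cli_args is a hypothetical helper, not a gget function, and it handles a single sequence string for simplicity.

```python
import shlex


def alphafold_cli_args(
    sequence,
    multimer_recycles=3,
    out=None,
    multimer_for_monomer=False,
    relax=False,
    quiet=False,
):
    """Translate Python-style keyword arguments into a gget alphafold command line.

    Hypothetical helper for illustration; it only assembles the argument list
    and does not run anything.
    """
    args = ["gget", "alphafold", sequence]
    if multimer_recycles != 3:  # 3 is the documented default
        args += ["--multimer_recycles", str(multimer_recycles)]
    if out is not None:
        args += ["--out", out]
    # Flags are True/False arguments in Python and bare switches on the command line
    if multimer_for_monomer:
        args.append("--multimer_for_monomer")
    if relax:
        args.append("--relax")
    if quiet:
        args.append("--quiet")
    return args


command = alphafold_cli_args(
    "MAAHKGAEHHHKAAEHHEQAAKHH",
    multimer_recycles=20,
    out="my_prediction",
    relax=True,
)
print(shlex.join(command))
```

The Python-only arguments plot and show_sidechains have no command-line counterpart, so the sketch leaves them out.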
Example
# Generate new prediction from amino acid sequence
gget alphafold MAAHKGAEHHHKAAEHHEQAAKHHHAAAEHHEKGEHEQAAHHADTAYAHHKHAEEHAAQAAKHDAEHHAPKPH
# Find similar sequences deposited on the PDB for comparative analysis
gget blast --database pdbaa MAAHKGAEHHHKAAEHHEQAAKHHHAAAEHHEKGEHEQAAHHADTAYAHHKHAEEHAAQAAKHDAEHHAPKPH
# Fetch the PDB files of similar structures returned by gget blast for comparison, to get a measure for model quality
gget pdb 3UQ3 -o 3UQ3.pdb
gget pdb 2K42 -o 2K42.pdb
# Python
import gget
gget.alphafold("MAAHKGAEHHHKAAEHHEQAAKHHHAAAEHHEKGEHEQAAHHADTAYAHHKHAEEHAAQAAKHDAEHHAPKPH")
# Find similar sequences deposited on the PDB for comparative analysis
gget.blast("MAAHKGAEHHHKAAEHHEQAAKHHHAAAEHHEKGEHEQAAHHADTAYAHHKHAEEHAAQAAKHDAEHHAPKPH", database="pdbaa")
# Fetch the PDB files of similar structures returned by gget blast for comparison, to get a measure for model quality
gget.pdb("3UQ3", save=True)
gget.pdb("2K42", save=True)
→ gget alphafold returns the predicted structure (PDB) and predicted alignment error (.json) in a new folder ("./[date_time]_gget_alphafold_prediction"). The use case above exemplifies how to use gget blast and gget pdb for a comparative analysis of the new prediction. PDB files can be viewed interactively in 3D online, or using programs like PyMOL or Blender. To compare two PDB files, you can use this website. The Python interface also returns interactive plots, which can be generated from the PDB and JSON as described in the gget alphafold FAQ Q4.
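As a starting point for comparing a prediction with a fetched structure outside a viewer, the C-alpha coordinates can be read directly from the PDB text. The sketch below relies only on the fixed-column layout of ATOM records in the PDB format; ca_coordinates and demo_atom_line are hypothetical helpers, and the demo records are made-up coordinates rather than real gget output.

```python
import math


def ca_coordinates(pdb_text):
    """Return (chain, residue number, (x, y, z)) for every C-alpha ATOM record."""
    atoms = []
    for line in pdb_text.splitlines():
        # Atom name sits in columns 13-16; coordinates in columns 31-54
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            residue_number = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((chain, residue_number, xyz))
    return atoms


def demo_atom_line(serial, name, residue, chain, residue_number, x, y, z):
    """Format one ATOM record in PDB fixed-column layout (demo data only)."""
    return (
        f"ATOM  {serial:5d} {name:<4} {residue:>3} {chain}{residue_number:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}"
    )


# Made-up records: one backbone nitrogen and two C-alpha atoms
demo_pdb = "\n".join(
    [
        demo_atom_line(1, " N  ", "MET", "A", 1, 0.0, 0.0, 0.0),
        demo_atom_line(2, " CA ", "MET", "A", 1, 1.0, 2.0, 3.0),
        demo_atom_line(3, " CA ", "ALA", "A", 2, 4.0, 6.0, 3.0),
    ]
)

ca_atoms = ca_coordinates(demo_pdb)
# Distance between the two consecutive C-alpha atoms of the demo data
ca_distance = math.dist(ca_atoms[0][2], ca_atoms[1][2])
print(len(ca_atoms), ca_distance)
```

To use it on real files, pass the contents of a predicted PDB file and of a file saved by gget pdb to ca_coordinates. A meaningful structural comparison additionally requires aligning the two structures, which this sketch does not attempt.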
Tutorials
🔗 Google Colab tutorial
🔗 Protein structure prediction with comparison to related crystal structures
🔗 gget alphafold FAQ
References
If you use gget alphafold in a publication, please cite the following articles:
- Luebbert, L., & Pachter, L. (2023). Efficient querying of genomic reference databases with gget. Bioinformatics. https://doi.org/10.1093/bioinformatics/btac836
- Jumper, J., Evans, R., Pritzel, A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021). https://doi.org/10.1038/s41586-021-03819-2
And, if applicable:
- Evans, R. et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv 2021.10.04.463034; https://doi.org/10.1101/2021.10.04.463034