Python arguments are equivalent to long-option arguments (for example, --translated on the command line corresponds to translated=True in Python), unless otherwise specified. Flags are True/False arguments in Python. The manual for any gget tool can be called from the command line using the -h flag.
gget diamond 💎
Align multiple protein or translated DNA sequences using DIAMOND. DIAMOND is similar to BLAST, but the computation runs locally.
Return format: JSON (command-line) or data frame/CSV (Python).
Positional argument
Sequences (str or list) or path to FASTA file containing sequences to be aligned against the reference.
Required arguments
Reference sequences (str or list) or path to FASTA file containing reference sequences.
Add the --translated flag (Python: translated=True) if reference sequences are amino acid sequences and query sequences are nucleotide sequences.
Optional arguments
Path to save the DIAMOND database created from reference.
Default: None -> Temporary db file will be deleted after alignment or saved in out if out is provided.
Sensitivity of alignment (str). Default: "very-sensitive".
One of the following: fast, mid-sensitive, sensitive, more-sensitive, very-sensitive, or ultra-sensitive.
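The sensitivity setting accepts only the six values listed above. As a minimal sketch, a user-supplied value could be checked before starting an alignment. The helper name check_sensitivity is hypothetical and not part of gget, and the whitespace and case normalization is a choice of this sketch rather than documented gget behavior.

```python
# The six sensitivity levels listed in the documentation above.
SENSITIVITY_LEVELS = (
    "fast",
    "mid-sensitive",
    "sensitive",
    "more-sensitive",
    "very-sensitive",
    "ultra-sensitive",
)


def check_sensitivity(value="very-sensitive"):
    """Return the normalized sensitivity level or raise ValueError (hypothetical helper)."""
    normalized = value.strip().lower()
    if normalized not in SENSITIVITY_LEVELS:
        allowed = ", ".join(SENSITIVITY_LEVELS)
        raise ValueError(f"Unknown sensitivity {value!r}; expected one of: {allowed}")
    return normalized


print(check_sensitivity())                     # documented default
print(check_sensitivity(" Ultra-Sensitive "))  # normalized by this sketch
```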
Number of threads used (int). Default: 1.
Path to DIAMOND binary (str). Default: None -> Uses the DIAMOND binary installed with gget.
Path to the folder to save results in (str), e.g. "path/to/directory". Default: Standard out; temporary files are deleted.
Flags
Perform translated alignment of nucleotide sequences to amino acid reference sequences.
Command-line only. Returns results in CSV format.
Python: Use json=True to return output in JSON format.
Command-line only. Prevents progress information from being displayed.
Python: Use verbose=False to prevent progress information from being displayed.
Example
On the command line, list the positional argument (the query sequences) first so that it is not added as a reference sequence.
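The original example commands did not survive on this page. As a minimal sketch of the ordering rule noted above, the helper below assembles a command line with the query sequences first and the reference sequences after a reference option. The helper name build_diamond_command is hypothetical. The --reference spelling is an assumption based on the statement that Python arguments mirror the long options, so check gget diamond -h before relying on it. The sequences are made-up placeholders.

```python
import shlex


def build_diamond_command(queries, references, translated=False):
    """Assemble a gget diamond command line as a list of arguments (hypothetical helper)."""
    if not queries or not references:
        raise ValueError("both queries and references are required")
    # Positional query sequences come first so they are not parsed as references.
    cmd = ["gget", "diamond", *queries]
    # Assumed long option for the reference sequences.
    cmd += ["--reference", *references]
    if translated:
        # Flag named in the documentation above.
        cmd.append("--translated")
    return cmd


# Placeholder sequences, not taken from the gget documentation.
cmd = build_diamond_command(
    ["MKTAYIAKQR", "GAVLIPFMW"],
    ["MKTAYIAKQRQISFVKSHFSRQ"],
)
print(shlex.join(cmd))
```

With gget installed, the list could be passed to subprocess.run(cmd, capture_output=True, text=True). The Python equivalent would presumably be gget.diamond(queries, reference=references), again assuming the argument is named reference.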
→ Returns results in JSON (command-line) or data frame/CSV (Python) format:
query_accession | subject_accession | identity_percentage | query_seq_length | subject_seq_length | length | mismatches | gap_openings | query_start | query_end | subject_start | subject_end | e-value | bit_score |
Seq0 | Seq0 | 100 | 13 | 37 | 13 | 0 | 0 | 1 | 13 | 1 | 13 | 2.82e-09 | 30.8 |
Seq2 | Seq0 | 100 | 12 | 37 | 12 | 0 | 0 | 1 | 12 | 26 | 37 | 4.35e-08 | 27.7 |
Seq3 | Seq1 | 100 | 15 | 15 | 15 | 0 | 0 | 1 | 15 | 1 | 15 | 2.01e-11 | 36.2 |
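As a rough illustration of working with these results, the sketch below reads the three rows shown above from CSV text and keeps the hits at or below an e-value cutoff. It assumes the CSV output uses the column names from the table header, which is not confirmed here. The helper name best_hits and the 1e-8 cutoff are hypothetical choices for this example.

```python
import csv
import io

# The three result rows from the table above, written as CSV text.
CSV_TEXT = """query_accession,subject_accession,identity_percentage,query_seq_length,subject_seq_length,length,mismatches,gap_openings,query_start,query_end,subject_start,subject_end,e-value,bit_score
Seq0,Seq0,100,13,37,13,0,0,1,13,1,13,2.82e-09,30.8
Seq2,Seq0,100,12,37,12,0,0,1,12,26,37,4.35e-08,27.7
Seq3,Seq1,100,15,15,15,0,0,1,15,1,15,2.01e-11,36.2
"""


def best_hits(csv_text, max_evalue=1e-8):
    """Return (query, subject, e-value) tuples at or below max_evalue, lowest e-value first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    hits = [
        (row["query_accession"], row["subject_accession"], float(row["e-value"]))
        for row in rows
        if float(row["e-value"]) <= max_evalue
    ]
    return sorted(hits, key=lambda hit: hit[2])


hits = best_hits(CSV_TEXT)
for query, subject, evalue in hits:
    print(query, subject, evalue)
```

In Python, where the results arrive as a data frame, the same filter could be written as a boolean mask on the e-value column.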
More examples
If you use gget diamond in a publication, please cite the following articles:
Luebbert, L., & Pachter, L. (2023). Efficient querying of genomic reference databases with gget. Bioinformatics.
Buchfink, B., Xie, C. & Huson, D. Fast and sensitive protein alignment using DIAMOND. Nat Methods 12, 59–60 (2015).