Genome Annotation

DFAST

DFAST - is a very quick prokaryotic genome annotation pipeline providing rich information on pseudogenes, translation exceptions and orthologous gene assignment between given reference genomes. DFAST also supports genome submission to public sequence databases
(Reference: Tanizawa Y et al. (2018) Bioinformatics. 34(6): 1037-1039).
One of my favourite annotation pipelines due to its speed and simplicity.


Bakta web server

Bakta web server - is a user-friendly web interface for conducting and visualizing annotations using Bakta without requiring command line expertise or local computing resources. Key features include interactive visualizations through circular genome plots, linear genome browsers, and searchable data tables facilitating the interpretation of complex annotation results. The web server generates standard bioinformatics outputs (GFF3, GenBank, EMBL) and annotates diverse genomic features, including coding sequences, non-coding RNAs, and small open reading frames (sORFs)
(Reference: Beyvers S et al. (2025) Nucleic Acids Research 53(W1): W51–W56).
Also available at Galaxy.eu. Requires registration.
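
Because the results can be downloaded as GFF3, a quick tally of the annotated feature types is easy to script. The following is only a minimal sketch: the function name count_feature_types and the sample records are hypothetical (they are not real Bakta output), and the code relies solely on the GFF3 convention that each feature line has nine tab-separated columns with the feature type in the third.

```python
from collections import Counter

def count_feature_types(gff3_text):
    """Count feature types (column 3) in the feature section of GFF3 text."""
    counts = Counter()
    for line in gff3_text.splitlines():
        # An optional embedded sequence section starts at the ##FASTA directive.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) == 9:
            counts[fields[2]] += 1
    return counts

# Hypothetical records, made up for illustration only.
SAMPLE_GFF3 = "\n".join([
    "##gff-version 3",
    "contig_1\texample\tCDS\t1\t300\t.\t+\t0\tID=cds1",
    "contig_1\texample\tCDS\t400\t900\t.\t-\t0\tID=cds2",
    "contig_1\texample\ttRNA\t1000\t1075\t.\t+\t.\tID=trna1",
    "##FASTA",
    ">contig_1",
    "ACGT",
])

if __name__ == "__main__":
    for feature_type, n in sorted(count_feature_types(SAMPLE_GFF3).items()):
        print(feature_type, n)
```

For a downloaded file, the same function can be applied to the file contents read as text.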


pharokka

pharokka - provides bacteriophage genome annotations in a fast, scalable and consistent fashion. Pharokka identifies predicted coding sequences (CDS), transfer RNAs (tRNAs), transfer-messenger RNAs (tmRNAs) and clustered regularly interspaced short palindromic repeats (CRISPRs), providing functional annotation for CDS using the PHROGs database
(Reference: Bouras G et al. (2023) Bioinformatics, 39(1): btac776).
Also available on Google Colab. Requires registration.


Proksee

Proksee - provides users with a powerful, easy-to-use, and feature-rich system for assembling, annotating, analysing, and visualizing bacterial genomes. Proksee accepts Illumina sequence reads as compressed FASTQ files or pre-assembled contigs in raw, FASTA, or GenBank format. Alternatively, users can supply a GenBank accession or a previously generated Proksee map in JSON format. Proksee then performs assembly (for raw sequence data), generates a graphical map, and provides an interface for customizing the map and launching further analysis jobs. Notable features of Proksee include unique and informative assembly metrics provided via a custom reference database of assemblies; a deeply integrated high-performance genome browser for viewing and comparing analysis results at individual base resolution (developed specifically for Proksee); an ever-growing list of embedded analysis tools whose results can be seamlessly added to the map or searched and explored in other formats; and the option to export graphical maps, analysis results, and log files for data sharing and research reproducibility
(Reference: Grant JR et al. (2023) Nucleic Acids Res. 51(W1): W484-W492).


RAST

RAST (Rapid Annotation using Subsystem Technology) is a fully-automated service for annotating bacterial and archaeal genomes. It provides high quality genome annotations for these genomes across the whole phylogenetic tree. Requires registration.
(Reference: Aziz, RK et al. 2008. BMC Genomics 9:75).


BV-BRC

BV-BRC (Bacterial and Viral Bioinformatics Resource Center) - is a comprehensive resource supporting research on bacterial and viral pathogens. It currently hosts over 14 million publicly available genomes and 33 high-throughput bioinformatic analysis services with numerous visual analytic tools allowing researchers to analyze their private data, generate comparisons with public data, and share data and results with colleagues.
(Reference: Shukla M et al. 2025. Nucl. Acids Res. gkaf1254).


BASys2

BASys2 (Bacterial Annotation System 2.0) - this powerful web server for comprehensive bacterial genome annotation accepts either FASTA or FASTQ files. It identifies all gene types (protein-coding, tRNA, rRNA, etc.) and generates up to 62 annotation fields per gene using over 30 tools and 10 databases. The interactive genome viewer provides detailed, multi-resolution visualizations and clickable gene cards, while also supporting metabolome annotations and 3D protein structure visualizations. Annotations include structural, functional, and statistical data, with results available for download in JSON and GenBank formats. BASys2 delivers fast, extensive, and high-quality genome annotations that rival or exceed those in databases like UniProt
(Reference: Poelzer J et al. 2025. Nucleic Acids Research 53(W1): W57 - W67).


MicroScope

MicroScope - (CEA, Institut de Génomique - Genoscope, France) is a microbial genome annotation & analysis platform which provides access to a wide range of tools including COG analysis, comparative genomics ...
(Reference: Vallenet D et al. (2017) Nucleic Acids Res. 45(D1): D517-D528).
Requires registration.


MAKER Web Annotation Service

MAKER Web Annotation Service (MWAS) is an easily configurable web-accessible genome annotation pipeline. Its purpose is to allow research groups with small to intermediate amounts of eukaryotic and prokaryotic genome sequence (i.e. BAC clones, small whole genomes, preliminary sequencing data, etc.) to independently annotate and analyse their data and produce output that can be loaded into a genome database
(Reference: Holt, C. & Yandell, M. 2011. BMC Bioinformatics 12:491).


MITOS2

MITOS2 (part of Galaxy.org) - is a pipeline designed to provide consistent and high quality de novo annotation of metazoan mitochondrial genome sequences. The authors show that the results of MITOS match RefSeq and MitoZoa in terms of annotation coverage and quality, while avoiding the biases, inconsistencies of nomenclature, and typos that originate from manual curation strategies
(Reference: M. Bernt et al. 2013. Molecular Phylogenetics & Evolution 69:313-319).


GenSAS

GenSAS - Genome Sequence Annotation Server - provides a one-stop website with a single graphical interface for running multiple structural and functional annotation tools, enabling visualization and manual curation of genome sequences. Users can upload sequences into their account and, with custom parameter settings, run gene prediction programs and protein homology searches, map ESTs, and identify repeats, ORFs and SSRs. Each analysis is displayed on a separate track of the graphical interface, with custom editable tracks for selecting the final annotation of features and creating GFF3 files for upload to genome browsers such as GBrowse. Additional programs can be easily added using this Drupal-based software.


FLAN

FLAN (FLu ANnotation) is an NCBI web server for annotating user-provided influenza A virus or influenza B virus sequences. It can validate and predict protein sequences encoded by an input flu sequence
(Reference: Y. Bao et al. 2007. Nucleic Acids Res. (Web Server issue) 35: W280-W284).


GATU

Genome Annotation Transfer Utility (GATU) annotates a genome based on a very closely related reference genome. The proteins/mature peptides of the reference genome are BLASTed against the genome to be annotated in order to find the genes/mature peptides in the genome to be annotated
(Reference: V. Tcherepanov et al. 2006. BMC Genomics 7:150).
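
The idea of transferring annotations from a reference can be sketched in a few lines. This is a toy illustration only and not GATU's implementation: exact substring matching stands in for BLAST, the standard genetic code (NCBI translation table 1) is assumed, and the names transfer_annotation, geneA and the demo sequence are hypothetical. Each reference protein is searched for in all six reading frames of the target genome, and hits are reported as 1-based inclusive coordinates on the forward strand.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T sequence."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def transfer_annotation(genome, reference_proteins):
    """Locate reference proteins in the six-frame translation of a genome.

    Returns (name, start, end, strand) tuples sorted by start, with 1-based
    inclusive coordinates on the forward strand. Exact matching is a
    simplification; a real tool would use a similarity search such as BLAST.
    """
    genome = genome.upper()
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(seq[frame:])
            for name, ref in reference_proteins.items():
                pos = protein.find(ref)
                while pos != -1:
                    start = frame + 3 * pos      # 0-based, on the searched strand
                    end = start + 3 * len(ref)   # exclusive
                    if strand == "-":
                        # Map back to forward-strand coordinates.
                        start, end = len(genome) - end, len(genome) - start
                    hits.append((name, start + 1, end, strand))
                    pos = protein.find(ref, pos + 1)
    return sorted(hits, key=lambda hit: hit[1])

if __name__ == "__main__":
    demo_genome = "CCATGAAACCCGGGTAATT"   # made-up sequence encoding M-K-P-G
    print(transfer_annotation(demo_genome, {"geneA": "MKPG"}))
```

Exact matching only works when the two genomes encode identical proteins, which is why a tool of this kind needs a very closely related reference and a tolerant search method.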


Companion

Companion - allows non-experts to annotate their arthropod, fungal or protozoan genomes using a reference-based method, enabling them to assess the output before submitting to public databases
(Reference: Haese-Hill W et al. 2024. Nucleic Acids Research 52(W1): W39 - W44).


BioGPS

BioGPS (The Scripps Research Institute, USA) - is a one-stop gene annotation portal that emphasizes user-customizability and community-extensibility. It is a complete resource for learning about gene and protein function.


MOSGA

MOSGA - (Modular Open-Source Genome Annotator) - is a genome annotation framework for eukaryotic genomes with a user-friendly web-interface that generates and integrates annotations from various tools. The aggregated results can be analyzed with a fully integrated genome browser and are provided in a format ready for submission to NCBI
(Reference: Martin R et al. 2020. Bioinformatics 36: 5514-5515).


BAGEL

BAGEL (Groningen Biomolecular Sciences and Biotechnology Institute, Haren, the Netherlands) - will determine from an existing or non-submitted GenBank file the presence of bacteriocins based on a database containing information on known bacteriocins and adjacent genes involved in bacteriocin activity. See LABiocin if you are interested in the topic of Lactic Acid Bacteria (LAB) and their bacteriocins.


MG-RAST

MG-RAST (Metagenome Rapid Annotation using Subsystem Technology) is a fully-automated service for annotating metagenome samples. It provides annotation of sequence fragments, their phylogenetic classification and an initial metabolic reconstruction. The service also provides means for comparing phylogenetic classifications and metabolic reconstructions of metagenomes
(Reference: F. Meyer et al. 2008. BMC Bioinformatics 9: 386).


Mobile Genetic Elements - Not prophage

MOBHunter

MOBHunter - mobile genetic elements (MGEs) range from small transposons to conspicuous integrative and conjugative elements. These regions often confer advantageous traits, including antibiotic resistance or novel metabolic capabilities, and contain foreign sequence signatures and hallmark genes such as transposases, integrases, etc. While bioinformatic tools target specific MGE subsets using alignments, compositional signatures, or diagnostic gene mapping, no single platform offers a unified framework for comprehensive, evidence-based, MGE identification and classification. MOBHunter is an advanced bioinformatic pipeline that consolidates standalone tools and in-house algorithms.
(Reference: Rojas-Villalobos C et al. 2025. Nucleic Acids Research 53(W1): W398 - W407).


Chromosome replication origin:

Ori-Finder

Ori-Finder - is a useful platform for the identification and analysis of replication origins (oriCs) in bacterial genomes
(Reference: Luo H et al. (2019) Brief Bioinform 20(4): 1114-1124).
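
For a rough first look before submitting a sequence, the classic cumulative GC skew heuristic can be scripted. This is a simple stand-in and not a description of Ori-Finder's own method: in many bacterial chromosomes the leading strand carries more G than C, so the running total of (G - C) tends to reach its minimum near the replication origin. The function names below are hypothetical, and the result should be treated as a hint only.

```python
def cumulative_gc_skew(sequence):
    """Return the running total of (G count - C count) along the sequence."""
    total = 0
    skew = []
    for base in sequence.upper():
        if base == "G":
            total += 1
        elif base == "C":
            total -= 1
        skew.append(total)
    return skew

def skew_minimum_position(sequence):
    """Return the 1-based position of the first cumulative GC skew minimum."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    skew = cumulative_gc_skew(sequence)
    return skew.index(min(skew)) + 1

if __name__ == "__main__":
    toy = "CCCCGGGG"  # made-up sequence: skew falls for four bases, then rises
    print(cumulative_gc_skew(toy))
    print(skew_minimum_position(toy))
```

On real chromosomes the curve is noisy, and because the sequence is circular the minimum can sit close to either end of the deposited record.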


OriV-Finder

OriV-Finder - is a comprehensive web server for bacterial plasmid replication origin analysis. It uses replication initiation proteins (RIPs) and sequence data to identify replication origins in plasmids.
(Reference: Li Y & Gao F et al. 2025. Nucleic Acids Research 53(W1): W451 - W456).


DoriC

Please note that these tools have been used to create DoriC - a database of replication origins in prokaryotic genomes including chromosomes and plasmids.
(Reference: Luo H & Gao F (2019) Nucleic Acids Res. 47(D1): D74-D77).

Updated: February, 2026