Bacteriophage
Naming your bacteriophage:
It is of prime importance that members of the bacterial virus community
name their newly isolated phages appropriately. A good place to start
is
"How to Name and Classify Your Phage: An Informal Guide"
(Reference: Adriaenssens E & Brister JR. 2017. Viruses
9(4). pii: E70),
to which I will add the following points:
(a) please check that the name you propose has not been used already;
and,
(b) do not name your phage Enterobacteria phage ø1234 or Enterobacteria
phage 2017/ABC_567, since these names are incompatible with the creation of
new species and genus taxa by the International Committee on Taxonomy of
Viruses (ICTV).
To find out whether your proposed name is unique, consult:
Phage Name Check
Phage Name Check (Stephen T. Abedon, Ohio State University, USA) - to see whether 'your' phage name is currently found on Google Scholar, Google Books, PubMed, or even Bacteriophage Names 2000.
CPT Phage Name Search
CPT Phage Name Search (Center for Phage Technology at Texas A&M University)
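As a rough illustration of points (a) and (b) above, the following Python sketch screens a proposed name before you consult the web tools. The function name, the list of known names, and the heuristics themselves (flagging the generic "Enterobacteria phage" prefix and characters such as "ø" or "/", modelled on the two discouraged examples) are assumptions made for this example. They are not ICTV rules and do not replace the searches listed above.

```python
import re


def screen_phage_name(candidate, known_names):
    """Return a list of warnings for a proposed phage name (heuristic sketch)."""
    warnings = []
    normalized = candidate.strip().lower()

    # Point (a): case-insensitive comparison against names already in use.
    if normalized in {name.strip().lower() for name in known_names}:
        warnings.append("name already in use")

    # Point (b): patterns modelled on the two discouraged examples in the text.
    if normalized.startswith("enterobacteria phage"):
        warnings.append("generic Enterobacteria phage prefix")
    # Assumption: only letters, digits, spaces, underscores and hyphens are accepted.
    if re.search(r"[^A-Za-z0-9 _\-]", candidate):
        warnings.append("special characters in name")
    return warnings
```

An empty list only means that none of these simple checks fired; the name still needs to be searched with the tools above.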
Other useful phage resources:
PhageScope
PhageScope
- applying fifteen state-of-the-art tools to perform systematic
annotations and analyses, PhageScope provides annotations on genome
completeness, host range, lifestyle information, taxonomy classification,
nine types of structural and functional genetic elements, and three types
of comparative genomic studies for curated phages. Additionally,
PhageScope incorporates automatic analyses and visualizations for curated
and customized phages, serving as an efficient platform for phage study
(Reference: Wang RH et al. (2024) Nucleic Acids
Research, 52(D1): D756-D761).
VIRALpro
VIRALpro
- is an effective tool for identifying capsid and tail protein sequences,
which are the cornerstones toward viral sequence annotation and viral
genome classification. It is part of the SCRATCH Protein Predictor, a
useful protein analysis meta site
(Reference: Galiez C et al. (2016) Bioinformatics,
32(9): 1405-1407).
taxMyPhage
taxMyPhage
- is a system for the rapid automated classification of dsDNA
bacteriophage genomes. The system integrates a MASH database, built from
ICTV-classified phage genomes, to identify closely related phages,
followed by BLASTn to calculate intergenomic similarity, conforming to
ICTV guidelines for genus and species classification. The system also
detected inconsistencies in current ICTV classifications, identifying
cases where genera did not adhere to ICTV's 70% average nucleotide
identity (ANI) threshold for genus classification or 95% ANI for species.
(Reference: Millard A et al. (2025) Phage (New
Rochelle). 6(1): 5-11).
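The two thresholds quoted above can be expressed as a small decision rule. This is a minimal Python sketch, not taxMyPhage's code: the function name is hypothetical, and treating the thresholds as inclusive (greater than or equal to) is an assumption made here.

```python
GENUS_THRESHOLD = 70.0    # percent, from the description above
SPECIES_THRESHOLD = 95.0  # percent, from the description above


def shared_rank(similarity_percent):
    """Return the lowest rank two genomes would share at this similarity."""
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity must be between 0 and 100")
    # Assumption: a value exactly on a threshold counts as meeting it.
    if similarity_percent >= SPECIES_THRESHOLD:
        return "same species"
    if similarity_percent >= GENUS_THRESHOLD:
        return "same genus"
    return "different genus"
```

The similarity value itself would come from a tool such as taxMyPhage or VIRIDIC; this rule only interprets it.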
TaxaBLAST
TaxaBLAST - will search a database composed of all ICTV species exemplar genomic nucleotide sequences for exemplars most similar to your chosen query sequence. The BLAST database used for the search is derived from the most recent Virus Metadata Resource (VMR) release. The results will provide an indication of how similar your virus sequence is to the sequences of other viruses that are representative of the virus isolates classified into a particular species.
VIRIDIC
VIRIDIC
(Virus Intergenomic Distance Calculator; C. Moraru, Institute for
Chemistry and Biology of the Marine Environment, Germany) - the first
level of bacteriophage classification by
ICTV
involves computing the overall DNA sequence identity between two viruses.
This new tool computes pairwise intergenomic distances/similarities
amongst phage genomes. To run it, upload a single FASTA file with all
phage genomes of interest, create a project, and press run. Save the
project ID that is displayed when the project is created; you will
need it to access the data if the calculations take a long time
(Reference: Moraru C et al. 2020. Viruses. 12(11):
1268).
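Since VIRIDIC expects a single FASTA file containing all genomes, separate genome files have to be combined first. The following Python sketch shows one way to do this with the standard library only. The function name is hypothetical, and the check for duplicate sequence IDs is a precaution assumed here, not a documented VIRIDIC requirement.

```python
from pathlib import Path


def merge_fasta(input_paths, output_path):
    """Concatenate FASTA files into one file and return the number of records."""
    seen_ids = set()
    records = 0
    with open(output_path, "w") as out:
        for path in input_paths:
            for line in Path(path).read_text().splitlines():
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                if line.startswith(">"):
                    parts = line[1:].split()
                    if not parts:
                        raise ValueError(f"empty FASTA header in {path}")
                    seq_id = parts[0]
                    # Assumption: every genome should carry a unique ID.
                    if seq_id in seen_ids:
                        raise ValueError(f"duplicate sequence ID: {seq_id}")
                    seen_ids.add(seq_id)
                    records += 1
                out.write(line + "\n")
    return records
```

For example, `merge_fasta(["phageA.fasta", "phageB.fasta"], "all_phages.fasta")` (hypothetical file names) would write one combined file ready for upload.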
VipTree
VipTree
- generates a "proteomic tree" of viral genome sequences based on
genome-wide sequence similarities computed by tBLASTx. The original
proteomic tree concept (i.e., "the Phage Proteomic Tree") was developed
by Rohwer and Edwards, 2002. A proteomic tree is a dendrogram that
reveals global genomic similarity relationships between tens, hundreds,
and thousands of viruses. It has been shown that viral groups identified
in a proteomic tree correspond well to established viral taxonomies.
(Reference: Nishimura Y et al. (2017) Bioinformatics
33: 2379-2380).
PhaBOX
PhaBOX
- can comprehensively identify and analyze phage contigs in metagenomic
data. It supports integrated phage analysis, including phage contig
identification from the metagenomic assembly, lifestyle prediction,
taxonomic classification, and host prediction. Instead of treating the
algorithms as a black box, PhaBOX also supports visualization of the
essential features for making predictions. The web server is designed
with a user-friendly graphical interface that enables both
informatics-trained and nonspecialist users to analyze phages in
microbiome data with ease. Click on Pipeline to get started
(Reference: Shang J et al. (2023) Bioinform Adv
3(1): vbad101).
PhageGE
PhageGE
(Phage Genome Explorer) - is a user-friendly graphical interface
application for the interactive analysis of phage genomes. PhageGE
enables users to perform key analyses, including phylogenetic analysis,
visualization of phylogenetic trees, prediction of phage life cycle, and
comparative analysis of phage genome annotations.
(Reference: Zhao J et al (2024) Gigascience 13:
giae074).
PhageAI
PhageAI
- provides access to more than 10 000 publicly available bacteriophages and
differentiates between their major types of life cycle: lytic and
lysogenic. The tool includes a life cycle classifier which achieved 98.90%
accuracy on a validation set and 97.18% average accuracy on a test set.
(Reference: Piotr Tynecki, Arkadiusz Guzinski,
Joanna Kazimierczak, Michal Jadczuk, Jaroslaw Dastych, Agnieszka Onisko. doi:
https://doi.org/10.1101/2020.07.11.198606).
Requires free registration.
VAPEX
VAPEX
(Virus And Phage EXplorer) - is an interactive web server for the deep
exploration of natural virus and phage genomes. This tool enables users to
easily perform various genomic analysis queries on all natural viruses
and phages that have been fully sequenced and are listed in the NCBI
compendium. VAPEX therefore excels in producing visual depictions of
fully resolved synteny maps, which is one of its key strengths. VAPEX has
the ability to exhibit a vast array of orthologous gene classes
simultaneously through the use of symbolic representation. Additionally,
VAPEX can fully analyze user-submitted viral and phage genomes, including
those that have not yet been annotated.
(Reference: Hepp B et al. (2023) Bioinformatics
39(8): btad528).
PhaGAA
PhaGAA
- is an integrated web server platform for phage genome annotation and
analysis. By incorporating several annotation tools, PhaGAA is
designed to annotate phage genomes at the DNA and protein levels and
provide the analytical results. DNA-based annotation includes host
prediction, closest phage search, lifestyle recognition, and promoter and
spanin gene identification. Protein-based annotation consists of virion
protein identification, protein domain identification, and structural and
functional protein classification.
(Reference: Wu J et al. 2023. Bioinformatics 39(3):
btad120).
DePolymerase Predictor
DePolymerase Predictor
- phage depolymerases can degrade the extracellular matrix that is integral to the formation of all biofilms and, as such, would allow complementary therapies or disinfection procedures to be applied successfully. The authors describe the development and application of a machine-learning-based approach to the identification of phage depolymerases.
(Reference: Magill DJ & Skvortsov TA. 2023. BMC Bioinformatics 24:208).
PhageDPO
PhageDPO
- is trained on a comprehensive dataset that includes sequences related to seven specific DPO-related domains, completed with DPOs validated in the literature. Training a Support Vector Machine (SVM) model resulted in a test accuracy of 96%, a recall of 97%, a precision of 94% and an F1-score of 96%, demonstrating its capability in predicting DPOs in phage genomes.
(Reference: Vieira MF et al. 2025. Comput Biol Med 188:109836).
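For readers less familiar with the four metrics quoted above, this Python sketch shows how they are derived from the counts of a binary confusion matrix. It is a generic illustration, not PhageDPO code. The function name is hypothetical, and it assumes that at least one positive prediction and one true positive case exist, so that no division by zero occurs.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)  # fraction of predicted DPOs that are real
    recall = tp / (tp + fn)     # fraction of real DPOs that were found
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }
```

Here `tp`, `fp`, `tn` and `fn` are the numbers of true positives, false positives, true negatives and false negatives on the test set.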
VIRFAM
VIRFAM
- allows automated classification of tailed bacteriophages according to
their neck organization. This webserver automatically identifies proteins
of the phage head-neck-tail module and assigns phages to the most closely
related cluster of phages.
(Reference: Lopes A et al. (2014) BMC Genomics 15: 1027)
PVP-SVM
PVP-SVM
- is a web-based prediction server for bacteriophage virion proteins. An SVM-based prediction model was developed using the optimal feature set selected by random forest. The optimal feature set includes amino acid composition, atomic composition, dipeptide composition, and physicochemical properties as input features. For a given peptide, PVP-SVM predicts its class and probability values.
(Reference: Manavalan B, Shin TH & Lee G. (2018) Front Microbiol. 9: 476).
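Two of the feature types named above, amino acid composition and dipeptide composition, are simple to compute. The following Python sketch illustrates the general idea only. The function names are hypothetical, the sequence is assumed to contain at least two residues, and PVP-SVM's actual feature extraction and feature selection are not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return 20 values: the fraction of the sequence made up by each residue."""
    seq = seq.upper()
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 values: the fraction of adjacent residue pairs of each type."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [
        pairs.count(a + b) / len(pairs)
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]
```

Concatenating the two lists gives a fixed-length numeric vector of the kind a classifier such as an SVM takes as input.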
PHISDetector
PHISDetector
- Phage-microbe interactions leave diverse signals in bacterial and phage
genomic sequences, defined as phage-host interaction signals (PHISs),
which include clustered regularly interspaced short palindromic repeats
(CRISPR) targeting, prophage, and protein-protein interaction signals.
PHISDetector (phage-host interaction signal detector) predicts
phage-host interactions by detecting and integrating diverse in silico
PHISs, and scores the probability of phage-host interactions using
machine learning models based on PHIS features.
(Reference: Zhou F et al (2022) Genomics Proteomics
Bioinformatics. 20(3): 508-523).
PhageTailFinder
PhageTailFinder - is a novel software tool suitable for high-throughput phage tail region detection. It requires phage genomic sequences in GenBank or FASTA format as input.
Pharokka
Pharokka
- was created to provide a tool that annotates bacteriophage genomes easily, rapidly and consistently with standards-compliant outputs. This resource is available through Galaxy.eu. A potential problem with this program is that in certain cases CDSs overlap with tRNA genes.
(Reference: Bouras G et al. 2023. Bioinformatics 39(1):btac776).
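The CDS/tRNA overlap issue mentioned above can be screened for once feature coordinates have been extracted from the annotation output. This Python sketch assumes each feature has already been reduced to a `(name, start, end)` tuple with 1-based inclusive coordinates. That tuple layout and the function name are assumptions for this example, not a Pharokka output format.

```python
def overlapping_pairs(cds_features, trna_features):
    """Return (cds_name, trna_name, overlap_length) for every overlapping pair."""
    hits = []
    for cds_name, cds_start, cds_end in cds_features:
        for trna_name, trna_start, trna_end in trna_features:
            # Length of the shared interval; zero or negative means no overlap.
            overlap = min(cds_end, trna_end) - max(cds_start, trna_start) + 1
            if overlap > 0:
                hits.append((cds_name, trna_name, overlap))
    return hits
```

Any pair reported this way is a candidate for manual inspection before the annotation is submitted.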
Pharokka, Phold & Phynteny
Pharokka, Phold & Phynteny
- Pharokka, Phold and Phynteny are complementary tools and, when used together, they substantially increase the annotation rate of your phage genome. They can be accessed through a Google Colab notebook.
Phold - is an annotation framework utilizing protein structural information that combines the ProstT5 protein language model and structural alignment tool Foldseek. Phold assigns annotations using a database of over 1.36 million predicted phage protein structures with high-quality functional labels. Benchmarking reveals that Phold outperforms existing sequence-based homology approaches in functional annotation sensitivity whilst maintaining speed, consistency, and scalability.
(Reference: Bouras G. et al. 2026. Nucleic Acids Res 54(1):gkaf1448).
Phynteny - is a genome-scale, deep learning framework that leverages gene synteny to predict the function of unknown bacteriophage genes. Phynteny integrates protein language model embeddings with positional encoding, bidirectional long short-term memory, and transformer encoders featuring circular attention to learn genome-wide organisational patterns.
(Reference: Grigson SR et al. 2025. bioRxiv doi:https://doi.org/10.1101/2025.07.28.667340).
PhageLeads
PhageLeads
- this tool consists of an ensemble of machine-learning-based predictors for determining the presence of temperate markers (integrase, Cro/CI repressor, immunity repressor, DNA partitioning protein A, and antirepressor), along with the integration of the ABRicate tool to determine the presence of antibiotic resistance genes and virulence genes. Using the biological features of the temperate markers, the authors were able to predict the presence of the temperate markers with high MCC scores (>0.70), corresponding to the lifestyle of the phages with an accuracy of 96.5%.
(Reference: Yukgehnaish K et al. 2022. Viruses 14(2): 342)
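The MCC (Matthews correlation coefficient) quoted above ranges from -1 to 1 and, unlike accuracy, stays informative when the two classes are unbalanced. This Python sketch computes it from confusion-matrix counts. The function name is hypothetical, and returning 0.0 when the denominator is zero is a common convention adopted here, not something taken from the PhageLeads paper.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Compute the Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: an empty row or column of the matrix gives MCC = 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

A value above 0.70, as reported for the temperate markers, indicates strong agreement between predicted and true labels.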
PhageTerm
PhageTerm
- relies on the detection of biases in the number of reads, which are observable at natural DNA termini compared with the rest of the phage genome. It was validated using a set of phages with well-established packaging mechanisms representative of the termini diversity, i.e. 5'cos (Lambda), 3'cos (HK97), pac (P1), headful without a pac site (T4), DTR (T7) and host fragment (Mu). PhageTerm is available through Galaxy.pasteur.
(Reference: Garneau JR et al. 2017. Sci Rep. 7(1):8292).
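The underlying idea, that reads start far more often at a natural DNA terminus than elsewhere, can be illustrated with a toy calculation. The Python sketch below simply flags positions where the number of read starts exceeds a multiple of the genome-wide average. The function name and the `fold` cut-off are assumptions for this example. PhageTerm's actual statistics are described in the cited paper and are not reproduced here.

```python
from collections import Counter


def candidate_termini(read_starts, genome_length, fold=10.0):
    """Return positions whose read-start count is at least `fold` times the mean."""
    counts = Counter(read_starts)
    # Average number of read starts per genome position.
    mean_starts = len(read_starts) / genome_length
    return sorted(
        position
        for position, n in counts.items()
        if n >= fold * mean_starts
    )
```

Here `read_starts` is a list holding the genome position at which each mapped read begins; for real data, PhageTerm itself should be used.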
PhageClouds
PhageClouds
- a total of 640,000 phage genomic sequences were retrieved from a variety of databases and public
virome assemblies. Intergenomic distances were calculated with dashing, an alignment-free method suitable for
handling massive data sets. These data were used to build a Neo4j graph database.
(Reference: Rangel-Pineros G et al. 2021. PHAGE. 2(4))
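To make the two steps above concrete (alignment-free distances, then a graph), here is a small Python sketch. It deliberately swaps in simpler techniques: exact k-mer sets with a Jaccard distance instead of dashing's sketch-based estimates, and a plain dictionary instead of a Neo4j database. All function names, the k-mer size, and the distance cut-off are assumptions for illustration.

```python
def kmer_set(seq, k=4):
    """Return the set of all k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=4):
    """Return 1 minus the Jaccard similarity of the two k-mer sets."""
    a, b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def build_graph(genomes, max_distance, k=4):
    """Link genomes (dict of name -> sequence) whose distance is within the cut-off."""
    names = sorted(genomes)
    graph = {name: {} for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            distance = jaccard_distance(genomes[first], genomes[second], k)
            if distance <= max_distance:
                graph[first][second] = distance
                graph[second][first] = distance
    return graph
```

Exact k-mer sets do not scale to hundreds of thousands of genomes, which is why an approximate method such as dashing is used in practice.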
The results below were generated using the members of the Herelleviridae presented in the above-mentioned paper.
Updated: February, 2026