Phylogeny
T-Rex
T-Rex (Tree and
reticulogram REConstruction) - is dedicated to the reconstruction of
phylogenetic trees, reticulation networks and to the inference of
horizontal gene transfer (HGT) events. T-REX includes several popular
bioinformatics applications such as MUSCLE, MAFFT, Neighbor Joining,
NINJA, BioNJ, PhyML, RAxML, random phylogenetic tree generator and some
well-known sequence-to-distance transformation models. It also comprises
fast and effective methods for inferring phylogenetic trees from
complete and incomplete distance matrices as well as for reconstructing
reticulograms and HGT networks
(Reference: Boc, A. et al. 2012. Nucl. Acids Res. 40 (W1): W573-W579).
Phylogeny.fr
Phylogeny.fr - is
a simple-to-use web service dedicated to reconstructing and analysing
phylogenetic relationships between molecular sequences. It includes
multiple alignment (MUSCLE, T-Coffee, ClustalW, ProbCons), phylogeny
(PhyML, MrBayes, TNT, BioNJ), tree viewer (Drawgram, Drawtree, ATV) and
utility programs (e.g. Gblocks to eliminate poorly aligned positions and
divergent regions)
(Reference: A. Dereeper et al., 2008. Nucl. Acids Res. 36 (Web Server Issue): W465-9).
Also available
here.
I have used this resource exclusively in the production of ICTV viral
taxonomy proposals (TaxoProps).
NGPhylogeny.fr
NGPhylogeny.fr -
is a successor to Phylogeny.fr that is more flexible in terms of tools and workflows, easily installable,
and more scalable. It integrates numerous tools in their latest version
(e.g. TNT, FastME, MrBayes, etc.) as well as new ones designed in the
last ten years (e.g. PhyML, SMS, FastTree, trimAl, BOOSTER, etc.). These
tools cover a large range of usage (sequence searching, multiple
sequence alignment, model selection, tree inference and tree drawing)
and a large panel of standard methods (distance, parsimony, maximum
likelihood and Bayesian). They are integrated in workflows, which have
been already configured ('One click'), can be customized ('Advanced'),
or are built from scratch ('A la carte').
(Reference: Lemoine F et al. (2019) Nucleic Acids Res 47(W1): W260–W265).
iTOL
iTOL (Interactive
Tree Of Life) - is a very impressive online tool for the display,
manipulation and annotation of phylogenetic and other trees.
(Reference: Letunic I & Bork P (2019) Nucleic Acids Res 47(W1): W256-W259)
FastME
FastME
provides distance algorithms to infer phylogenies. FastME is based on
balanced minimum evolution, which is the very principle of NJ. FastME
improves over NJ by performing topological moves using fast,
sophisticated algorithms. The first version of FastME only included
Nearest Neighbor Interchange (NNI). The new 2.0 version also includes
Subtree Pruning and Regrafting (SPR), while remaining as fast as NJ and
providing a number of facilities: distance estimation for DNA and
proteins with various models and options, bootstrapping, and parallel
computations.
(Reference: Lefort V. et al. Molecular Biology & Evolution 32(10): 2798-800, 2015).
PhyML
PhyML
- has been widely used because of its simplicity and a fair compromise
between accuracy and speed. In the meantime research on PhyML has
continued, and new algorithms and methods have been implemented in the
program.
(Reference: V. Lefort et al. (2017) Molecular Biology and Evolution, msx149).
Treeview
Treeview
(part of ETE Toolkit) - the Environment for Tree Exploration (ETE) is a
computational framework that simplifies the reconstruction, analysis,
and visualization of phylogenetic trees and multiple sequence
alignments. Here, we present ETE v3, featuring numerous improvements in
the underlying library of methods, and providing a novel set of
standalone tools to perform common tasks in comparative genomics and
phylogenetics. The new features include (i) building gene-based and
supermatrix-based phylogenies using a single command, (ii) testing and
visualizing evolutionary models, (iii) calculating distances between
trees of different size or including duplications, and (iv) providing
seamless integration with the NCBI taxonomy database.
(Reference: Huerta-Cepas J et al. (2016) Mol Biol Evol. 33(6): 1635-1638).
MAFFT
MAFFT
version 7 (Multiple alignment program for amino acid or nucleotide
sequences)
(Reference: Katoh K., J. Rozewicki & K.D. Yamada (2019) Brief Bioinformatics 20: 1160–1166).
Also includes links to downloadable versions of this program.
Evolview
Evolview
- is an interactive tree visualization tool designed to help researchers
in visualizing phylogenetic trees and in annotating these with
additional information. It offers users a platform to upload
trees in most common tree formats, such as Newick/Phylip, Nexus, Nhx
and PhyloXML, and provides a range of visualization options, using
fifteen types of custom annotation datasets. The new version of
Evolview was designed to provide simple tree uploads, manipulation and
viewing options with additional annotation types.
(Reference: Subramanian B et al. (2019) Nucleic Acids Res. 47(W1): W270-W275). Requires registration.
RAxML
RAxML
(Randomized Axelerated Maximum Likelihood) is a program for sequential
and parallel Maximum Likelihood based inference of large phylogenetic
trees
(Reference: Stamatakis, A. 2006. Bioinformatics 22:2688–2690).
Phylemon2
Phylemon2
- a suite of web-tools for molecular evolution, phylogenetics and
phylogenomics
(Reference: Sánchez, R. et al. 2011. Nucl. Acids Res. 39 (suppl_2): W470)
Phylodendron
Phylodendron - phylogenetic tree printer (D.G. Gilbert, Indiana Univ.) - very useful for visualizing *.dnd files from alignments and saving the results as .GIF, .PS or .PDF files. N.B. The font style and size can be altered in the .PDF output format.
IQ-TREE
IQ-TREE
- is an intuitive and user-friendly web interface and server for
IQ-TREE, an efficient phylogenetic software for maximum likelihood
analysis. W-IQ-TREE supports multiple sequence types (DNA, protein,
codon, binary and morphology) in common alignment formats and a wide
range of evolutionary models including mixture and partition models.
IQ-TREE performs fast model selection, partition scheme finding,
efficient tree reconstruction, ultrafast bootstrapping, branch tests,
and tree topology tests.
(Reference: Trifinopoulos J et al. (2016) Nucl. Acids Res. 44(W1): W232–W235)
Phylogenetic tree prediction
Phylogenetic tree prediction - GeneBee service (Belozersky Institute of Physico-chemical Biology, Moscow State University, Russia)
Phylogenetic Tree Plot
Phylogenetic Tree Plot (Laboratory of Bioinformatics, Wageningen UR, The Netherlands) - submit tree descriptions in PHYLIP (Newick) format only
Phylogenetic tree (newick) viewer
Phylogenetic tree (newick) viewer - is an online tool for phylogenetic tree view (newick format) that allows multiple sequence alignments to be shown together with the trees (fasta format). It uses the tree drawing engine implemented in the ETE toolkit, and offers transparent integration with the NCBI taxonomy database. Currently, alignments can be displayed in condensed or block-based format. Leaf names in the newick tree should match those in the fasta alignment.
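The requirement that leaf names in the Newick tree match those in the FASTA alignment can be checked before uploading. The following is a minimal sketch, not part of the viewer itself: the function names are hypothetical, and the regular expression assumes simple Newick with unquoted labels, no comments, and at least one pair of parentheses.

```python
import re


def newick_leaf_names(newick: str) -> set[str]:
    """Return leaf labels from a simple Newick string.

    Assumption: labels are unquoted and contain no whitespace or any of
    the characters ( ) , : ; so a leaf label is whatever directly
    follows "(" or ",". Labels after ")" are internal nodes and are skipped.
    """
    return set(re.findall(r"[(,]\s*([^\s(),:;]+)", newick))


def fasta_ids(fasta_text: str) -> set[str]:
    """Return the first whitespace-delimited token of each FASTA header."""
    return {
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }


def compare_names(newick: str, fasta_text: str) -> tuple[list[str], list[str]]:
    """Return (leaves missing from the FASTA, FASTA ids missing from the tree)."""
    leaves = newick_leaf_names(newick)
    ids = fasta_ids(fasta_text)
    return sorted(leaves - ids), sorted(ids - leaves)
```

Both returned lists are empty when the two files agree. Trees with quoted labels or bracketed comments would need a real Newick parser, such as the one in the ETE toolkit.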
CVTree4
CVTree4
constructs whole-genome based phylogenetic trees without sequence
alignment by using a Composition Vector (CV) approach. It was first
developed to infer evolutionary relatedness of microbial organisms and
then successfully applied to viruses, chloroplasts, and fungi. CVTree3
makes comparison with taxonomy and reports tree-branch monophyleticity
from domain to species.
(Reference: G. Zuo, & B. Hao (2015) Genomics Proteomics & Bioinformatics, 13: 321-331).
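The idea of comparing genomes without alignment can be illustrated with a much simplified composition-vector sketch. This is not the CVTree implementation: it uses raw k-mer frequencies and omits the background subtraction that CVTree applies, the function names and the default k are hypothetical, and the (1 - cosine)/2 distance is assumed here as a convenient form rather than taken from the description above.

```python
from collections import Counter
from math import sqrt


def kmer_vector(seq: str, k: int = 3) -> dict[str, float]:
    """Return relative frequencies of overlapping k-mers in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def cv_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Return (1 - cosine similarity) / 2 between two k-mer vectors.

    The value is 0 for identical compositions and 0.5 when the two
    vectors share no k-mers (all components are non-negative here).
    """
    dot = sum(v * b.get(kmer, 0.0) for kmer, v in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    if norm == 0:
        raise ValueError("at least one sequence is shorter than k")
    return (1.0 - dot / norm) / 2.0
```

A matrix of such pairwise distances could then be passed to a distance-based tree builder such as neighbor joining.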
webPRANK
webPRANK
- incorporates phylogeny-aware multiple sequence alignment,
visualisation and post-processing in an easy-to-use web interface.
(Reference: Löytynoja, A., & Goldman, N. 2010. BMC Bioinformatics. 11:579).
AnnoTree
AnnoTree
- is an interactive, functionally annotated bacterial tree of life that
integrates taxonomic, phylogenetic and functional annotation data from
over 27 000 bacterial and 1500 archaeal genomes. AnnoTree enables
visualization of millions of precomputed genome annotations across the
bacterial and archaeal phylogenies, thereby allowing users to explore
gene distributions as well as patterns of gene gain and loss in
prokaryotes. Using AnnoTree, we examined the phylogenomic distributions
of 28 311 gene/protein families, and measured their phylogenetic
conservation, patchiness, and lineage-specificity within bacteria.
(Reference: Mendler K et al. (2019) Nucleic Acids Res. 47(9): 4442-4448).
eShadow
eShadow
Evolutionary phylogenetic SHADOWing of closely related species
(Reference: Ovcharenko, I. et al. (2004) Genome Research, 14(6): 1191-1198)
SIFTER
SIFTER
(Statistical Inference of Function Through Evolutionary Relationships) is
a statistical approach to predicting protein function that uses a protein
family's phylogenetic tree as the natural structure for representing
protein relationships.
(Reference: S.M. Sahraeian et al. 2015. Nucl. Acids Res. 43 (W1): W141-W147).
PATH
PATH
- is a novel method to infer distant homology relations of two proteins,
that accounts for frameshift and point mutations that may have affected
the coding sequences. We design a dynamic programming alignment algorithm
over memory-efficient graph representations of the complete set of
putative DNA sequences of each protein, with the goal of determining the
two putative DNA sequences which have the best scoring alignment under a
powerful scoring system designed to reflect the most probable
evolutionary process.
(Reference: Gîrdea, M. et al. (2010) Algorithms for Molecular Biology 5: 6).
ReplacementMatrix
ReplacementMatrix
- maximum-likelihood estimation of amino acid replacement rate matrices.
(Reference: Dang, C.C. et al. 2011. Bioinformatics. 27(19):2758-2760).
DIVEIN
DIVEIN -
starting with a set of aligned sequences, DIVEIN estimates evolutionary
parameters and phylogenetic trees while allowing the user to choose from
a variety of evolutionary models; it then reconstructs the consensus
(CON), most recent common ancestor (MRCA), and center of tree (COT)
sequences. DIVEIN also provides tools for further analyses.
(Reference: Deng, W. et al. 2010. Biotechniques. 48(5):405-408).
REALPHY
REALPHY
(Reference sequence Alignment based Phylogeny) builder - is a free
online pipeline that can infer phylogenetic trees from whole genome
sequence data. The user only has to provide a small number of reference
genomes in either FASTA or GenBank format (contigs or fully sequenced
genomes) as well as a number of other query genomes which can be in
FASTQ (short reads), FASTA or GenBank format. All provided sequences
(references and queries) will then be mapped to each of the references
via bowtie2. From these alignments multiple sequence alignments will be
reconstructed from which phylogenetic trees are inferred via PhyML.
(Reference: Bertels F et al (2014) Mol Biol Evol. 31(5): 1077-1088).
TYGS
TYGS (Type (Strain)
Genome Server) - is a user-friendly high-throughput web server for
genome-based prokaryote taxonomy, connected to a large, continuously
growing database of genomic, taxonomic and nomenclatural information. It
infers genome-scale phylogenies and state-of-the-art estimates for
species and subspecies boundaries from user-defined and automatically
determined closest type genome sequences. TYGS also provides
comprehensive access to nomenclature, synonymy and associated taxonomic
literature. It is linked to the List of Prokaryotic names with Standing in
Nomenclature (LPSN).
(Reference: Meier-Kolthoff JP & Göker M. (2019) Nat Commun. 10(1): 2182).
BuscoPhylo
BuscoPhylo
- enables both students and established scientists to easily perform
Busco-based phylogenomic analysis starting from a set of genome
sequences.
(Reference: Sahbou AE et al (2022) Sci Rep. 12(1): 17352).
PhyloCloud
PhyloCloud
- is an online platform aimed at hosting, indexing and exploring large
phylogenetic tree collections, providing also seamless access to common
analyses and operations, such as node annotation, searching, topology
editing, automatic tree rooting, orthology detection and more. In
addition, PhyloCloud provides quick access to tools that allow users to
build their own phylogenies using fast predefined workflows, graphically
compare tree topologies, or query taxonomic databases such as NCBI or
GTDB. Finally, PhyloCloud offers a novel tree visualisation system based
on ETE Toolkit v4.0, which can be used to explore very large trees and
enhance them with custom annotations and multiple sequence alignments.
(Reference: Deng Z et al (2022) Nucleic Acids Res. 50(W1): W577-W582).
PHYLOViZ
PHYLOViZ
- is a server that allows the analysis of sequence-based typing methods
that generate allelic profiles and their associated epidemiological data.
Our motivation was to give a user-friendly solution for data analysis
and sharing without installing any specific software.
(Reference: Ribeiro-Gonçalves B et al (2016) Nucleic Acids Res. 44(W1): W246-W251).
iBIS2Analyzer
iBIS2Analyzer
- is a web server dedicated to a phylogeny-driven coevolution analysis of
protein families with different evolutionary pressure. It is based on the
iterative version, iBIS2, of the coevolution analysis method BIS, Blocks
in Sequences. iBIS2 is designed to iteratively select and analyse
subtrees in phylogenetic trees, possibly large and comprising thousands
of sequences. With iBIS2Analyzer the user visualizes, compares and
inspects clusters of coevolving residues by mapping them onto sequences,
alignments or structures of choice, greatly simplifying downstream
analysis steps.
(Reference: Oteri F et al. (2022) Nucleic Acids Research 50(W1): W412–W419.)
Molecular Taxonomy
AmphoraNet
AmphoraNet
- is capable of assigning a probability-weighted taxonomic group for each
phylogenetic marker gene found in the input metagenomic sample; the
webserver is based on the AMPHORA2 workflow. It uses 31 bacterial and
104 archaeal protein coding marker genes for metagenomic phylotyping.
Most of these are single copy genes, therefore AmphoraNet is suitable for
estimating the taxonomic composition of bacterial and archaeal
communities from metagenomic shotgun sequencing data.
(Reference: Kerepesi, C. et al. 2014. Gene 533: 538–540).
VIRIDIC
VIRIDIC
(Virus Intergenomic Distance Calculator) - the first level of
bacteriophage classification by
ICTV involves
computing the overall DNA sequence identity between two viruses. This new
tool computes pairwise intergenomic distances/similarities amongst phage
genomes. To run it, upload a single fasta file with all phage genomes of
interest, create a project and press run. Save the project ID that will
be displayed when the project is created. You will need it to access the
data if the calculations take a long time.
(Reference: Moraru C et al (2020). Viruses. 12(11): 1268.)
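Because the upload has to be a single FASTA file containing all phage genomes, separate per-genome files need to be combined first. The following is a minimal sketch of that step, not part of VIRIDIC: the function name is hypothetical, and collecting the record identifiers so they can be checked for duplicates is a precaution assumed here rather than a documented requirement.

```python
def merge_fasta(texts: list[str]) -> tuple[str, list[str]]:
    """Concatenate several FASTA texts into one multi-FASTA text.

    Blank lines are dropped so that records from different inputs do not
    run together. Returns the merged text and the list of record ids
    (the first token of each header) in input order.
    """
    lines: list[str] = []
    ids: list[str] = []
    for text in texts:
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                tokens = line[1:].split()
                ids.append(tokens[0] if tokens else "")
            lines.append(line)
    return "".join(line + "\n" for line in lines), ids
```

In practice the input texts would be read from the individual genome files and the merged text written to one new file, for example with pathlib's read_text and write_text.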
GGDC - Genome-to-Genome Distance Calculator
GGDC - Genome-to-Genome Distance Calculator
- The species concept for Bacteria and Archaea is ultimately based on
DNA-DNA hybridization (DDH). This web service can be used for
genome-based species delineation with complete or incomplete genome
sequences. The server calculates intergenomic distances, which are
converted into similarity values analogous to DDH and sent to you via
e-mail.
(Reference: Meier-Kolthoff, J.P. et al. (2013). BMC Bioinformatics 14:60).
VICTOR
VICTOR
(Virus Classification and Tree Building Online Resource;
Leibniz-Institut DSMZ-Deutsche Sammlung von Mikroorganismen und
Zellkulturen GmbH). This web service compares bacterial and archaeal
viruses ("phages") using their genome or proteome sequences. The results
include phylogenomic trees inferred using the Genome-BLAST Distance
Phylogeny method (GBDP), with branch support, as well as suggestions for
the classification at the species, genus and family level. (The service
can be applied to other kinds of viruses, too, but has not yet been
tested in this respect.) Upload your FASTA files, GenBank files and/or
GenBank accession IDs.
(Reference: JP Meier-Kolthoff & M Göker. (2017). Bioinformatics 33(21): 3396–3404).
VIRFAM
VIRFAM
is dedicated to the recognition of head-neck-tail modules and of
recombinase genes in phage genomes. You can use this server to search
for remote homologs of specific protein families within protein sequences
of bacteriophages. Input: protein sequences of your phage; output
includes a phylogenetic tree with the placement of your virus.
(Reference: Lopes A et al. Nucleic Acids Res. (2010) 38(12): 3952-62).
ANI calculator
ANI calculator
- estimates the average nucleotide identity using both best hits
(one-way ANI) and reciprocal best hits (two-way ANI) between two genomic
datasets. Typically, the ANI values between genomes of the same species
are above 95% while values below 75% are not to be trusted, and AAI
should be used instead. This tool supports both complete and draft
genomes (multi-fasta).
(Reference: Goris J. et al. (2007). Int J Syst Evol Microbiol. 57:81-91).
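The rule of thumb above can be written as a small helper for screening many pairwise results. This is a sketch only: the function name and the returned labels are hypothetical, and treating exactly 95% as the same-species range is an assumption, since the text says only "above 95%".

```python
def interpret_ani(ani_percent: float) -> str:
    """Classify an ANI value using the 95% and 75% thresholds quoted above."""
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    if ani_percent >= 95.0:  # boundary handling is an assumption
        return "typical of the same species"
    if ani_percent < 75.0:
        return "not to be trusted; use AAI instead"
    return "below the typical same-species range"
```

For example, interpret_ani(97.2) falls in the same-species range, while interpret_ani(70.0) signals that AAI should be used instead.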
Average Nucleotide Identity
Average Nucleotide Identity
(ANI) calculator - their ANI Calculator uses the OrthoANIu algorithm, an
improved iteration of the original OrthoANI algorithm, which uses
USEARCH instead of BLAST
(Reference: Yoon, S. H. et al. (2017). Antonie van Leeuwenhoek. 110:1281–1286).
TaxMan
TaxMan
- inspect your rRNA amplicons and taxa assignments - In microbiome
analyses, often rRNA gene databases are used to assign taxonomic names to
sequence reads. The TaxMan server facilitates the analysis of the
taxonomic distribution of your reads in two ways. First, you can check
what taxonomic names are assigned to the sequences produced by your
primers and what taxa you will lose. Second, the produced amplicon
sequences with lineages in the FASTA header can be downloaded. This can
result in a much more efficient analysis with respect to run time and
memory usage, since the amplicon sequences are considerably shorter than
the full length rRNA gene sequences. In addition, you can download a
lineage file that includes the counts of all taxa for your primers and
for the used reference.
(Reference: Brandt, B.W. et al. 2012. Nucleic Acids Research 40:W82-W87).
VipTree
VipTree -
generates a "proteomic tree" of viral genome sequences based on
genome-wide sequence similarities computed by tBLASTx. The original
proteomic tree concept (i.e., "the Phage Proteomic Tree") was developed
by Rohwer and Edwards, 2002. A proteomic tree is a dendrogram that
reveals global genomic similarity relationships between tens, hundreds,
and thousands of viruses. It has been shown that viral groups identified
in a proteomic tree correspond well to established viral taxonomies.
(Reference: Nishimura Y et al. (2017) Bioinformatics 33: 2379–2380)
VirClust
VirClust
– is a novel tool capable of performing i) hierarchical clustering of
viruses based on intergenomic distances calculated from their protein
cluster content, ii) identification of core proteins and iii) annotation
of viral proteins. VirClust groups proteins into clusters both based on
BLASTP sequence similarity, which identifies more related proteins, and
also based on hidden Markov models (HMM), which identifies more distantly
related proteins. Furthermore, VirClust provides an integrated
visualization of the hierarchical clustering tree and of the distribution
of the protein content, which allows the identification of the genomic
features responsible for the respective clustering.
(Reference: Moraru C (2023) Viruses. 15(4): 1007).
Updated: December, 2025