PROMOTERS & TERMINATORS

A. Bacterial

PromoterHunter - part of the phiSITE database, a collection of phage gene regulatory elements, genes, genomes and other related information, plus tools. (Reference: Klucar, L. et al. 2010. Nucleic Acids Res. 38(Database Issue): D366-D370).

PhagePromoter - a tool for locating promoters in phage genomes using machine learning methods. This is the first online tool for predicting promoters that uses phage promoter data and the first to identify both host and phage promoters with different motifs. It is part of Galaxy. (Reference: Sampaio M et al. (2019) Bioinformatics. 35(24): 5301-5302).

BacPP: Bacterial promoter prediction - A tool for accurate sigma-factor specific assignment in enterobacteria. Includes σ24, σ28, σ32, σ38, σ54 and σ70 with 84-97% accuracy. Requires registration. (Reference: S. de Avila e Silva et al. J. Theor. Biol., 287 (2011): 92–99).

iPro70-PseZNC - a tool for identifying σ70 promoters with novel pseudo nucleotide composition (Reference: Lai H-Y et al. (2019) Mol Ther Nucleic Acids. 17: 337-346). I recommend submitting fewer than 100 nt upstream of the start codon.
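To prepare input along the lines of that recommendation, the upstream region can be cut out of a genome sequence before submission. The following is only a minimal sketch: the function name `upstream_region`, the 0-based coordinate convention and the default of 99 nt are my own assumptions, not anything prescribed by iPro70-PseZNC.

```python
# Hypothetical helper (not part of any of the tools listed here).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def upstream_region(genome, start, strand="+", length=99):
    """Return up to `length` nt immediately upstream of a start codon.

    Assumptions: `start` is the 0-based forward-strand index of the first
    base of the start codon; for a minus-strand gene this is the highest
    coordinate of the gene. The result is always written 5'->3' relative
    to the gene, so its last base is the one adjacent to the start codon.
    """
    genome = genome.upper()
    if strand == "+":
        # Clip at the beginning of the sequence.
        return genome[max(0, start - length):start]
    if strand == "-":
        # Upstream of a minus-strand gene lies at higher forward coordinates.
        return reverse_complement(genome[start + 1:start + 1 + length])
    raise ValueError("strand must be '+' or '-'")
```

For a linear sequence the region is simply clipped at the sequence ends; a circular genome would need wrap-around handling, which this sketch leaves out.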

iPro54-PseKNC - a sequence-based predictor for identifying sigma-54 promoters in prokaryotes with pseudo k-tuple nucleotide composition. (Reference: Lin H et al. (2014) Nucleic Acids Res 42(21): 12961-12972).

Promoter Prediction by Neural Network (Martin Reese, Lawrence Berkeley Laboratory, CA, U.S.A.) - applicable to eukaryotes and prokaryotes (Reference: Reese MG, 2001. Comput Chem 26: 51-56). Dated; prokaryote results must be viewed skeptically.

BPROM (Softberry) - (Reference: V. Solovyev & A Salamov (2011) Automatic Annotation of Microbial Genomes and Metagenomic Sequences. In Metagenomics and its Applications in Agriculture, Biomedicine and Environmental Studies (Ed. R.W. Li), Nova Science Publishers, p. 61-78)

CNNPromoter_b - Prediction of Bacterial Promoters by CNN models in genomic sequences. (Reference: Umarov RK, & Solovyev VV (2017) PLoS One. 12(2): e0171410).

Deep Learning Recognition using Convolutional Neural Networks (CNNPromoter & CNNProm) - classification of prokaryotic and eukaryotic promoters and non-promoter sequences (Reference: Umarov R.K. & Solovyev V.V. (2017) PLoS One. 12(2): e0171410).

Virtual Footprint - offers two types of analysis: (a) Regulon analysis - analysis of a whole prokaryotic genome with one regulator pattern, and (b) Promoter analysis - analysis of a promoter region with several regulator patterns (Reference: R. Münch et al. 2005. Bioinformatics 21: 4187-4189).

PePPER (University of Groningen, The Netherlands) - a webserver for prediction of prokaryote promoter elements and regulons (Reference: de Jong, A. et al. 2012. BMC Genomics 13:299).

DOOR3 - Database of prOkaryotic OpeRons - offers a high-performance web service for online operon prediction on user-provided genomic sequences and an intuitive genome browser to support visualization of user-selected data, plus a huge database of transcriptional units. (Reference: X. Mao et al. 2014. Nucleic Acids Res. 42(Database issue): D654-9).

PATLOC (Pattern Locator) (Institute of Bioinformatics, University of Georgia, U.S.A.) - a tool for finding sequence patterns in long DNA sequences. The web-based service uses a restricted version of Pattern Locator, which estimates the time needed to complete the search and stops if the estimated CPU time exceeds a certain limit (currently 90 seconds). The CPU time limit was introduced to protect the web server from overloading due to requests involving overly complex sequence patterns. If you want to search for Sigma-70 (RpoD)-like promoters, the pattern syntax for your search is: <>{TTGACA(N)[15:18]TATAAT}[4]. N.B. the [4] allows for 4 mismatches - I recommend a maximum of two. If you only want one strand screened, omit the <> at the start. You can restrict the search to intergenic regions (but this will also eliminate matches that partially overlap with genes) or use the .patvic.txt output file to find where the matches are (Jan Mrázek, personal communication).
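To make the Sigma-70 pattern above concrete, here is a rough Python sketch of what such a search amounts to: a -35 hexamer (TTGACA), a spacer of 15 to 18 nt, and a -10 hexamer (TATAAT), scanned on one or both strands with a mismatch budget. This is not PATLOC's algorithm or code; the function names are hypothetical, and I am assuming the mismatch limit applies to the two hexamers combined.

```python
# Illustrative sketch only; names and the mismatch-counting rule are assumptions.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def count_mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_sigma70_like(seq, max_mismatches=2, both_strands=True,
                      minus35="TTGACA", minus10="TATAAT", spacer=(15, 18)):
    """Return (strand, start, spacer_length, matched_sequence, mismatches) tuples.

    `start` is 0-based on the strand that was scanned, so minus-strand
    positions refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    strands = [("+", seq)]
    if both_strands:  # corresponds to keeping the leading <> in the pattern
        strands.append(("-", reverse_complement(seq)))
    n35, n10 = len(minus35), len(minus10)
    hits = []
    for strand, s in strands:
        for i in range(len(s) - n35 + 1):
            mm35 = count_mismatches(s[i:i + n35], minus35)
            if mm35 > max_mismatches:
                continue
            for gap in range(spacer[0], spacer[1] + 1):
                j = i + n35 + gap
                box10 = s[j:j + n10]
                if len(box10) < n10:
                    break  # ran off the end of the sequence
                total = mm35 + count_mismatches(box10, minus10)
                if total <= max_mismatches:
                    hits.append((strand, i, gap, s[i:j + n10], total))
    return hits
```

The default of two mismatches follows the recommendation above; passing `both_strands=False` mimics omitting the <> from the pattern.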

SeqTU (RNA-seq Based Transcription Unit Finder for Prokaryotes) - a machine-learning method for accurately identifying TUs from RNA-seq data, based on two features of the assembled RNA reads: the continuity and stability of RNA-seq coverage across a genomic region. While the method performed well on Escherichia coli and Clostridium thermocellum, substantial work is needed to make the program generally applicable to all bacteria, since it requires organism-specific information. (Reference: Chen X et al. (2017) Sci Rep. 7: 43925).

B. Eukaryotic

Not being a eukaryotic molecular biologist, I cannot comment on the utility and accuracy of the following promoter-prediction programs.

FindM (Find Motifs around Functional Sites) - choose Promoter Motifs from Motif Library

Neural Network Promoter Prediction (Berkeley Drosophila Genome Project, U.S.A.) - dated (Reference: M.G. Reese 2001. Comput. Chem. 26: 51-6).

Promoter 2.0 Prediction Server (S. Knudsen, Center for Biological Sequence Analysis, Technical University of Denmark) - predicts transcription start sites of vertebrate Pol II promoters in DNA sequences

PROMOSER - Human, Mouse and Rat promoter extraction service (Boston University, U.S.A.) - maps promoter sequences and transcription start sites in mammalian genomes. (Reference: S. Anason et al. 2003. Nucl. Acids. Res. 31: 3554-59).

Promoter and gene expression regulatory motifs search (Softberry, U.S.A.) - offers a variety of promoter-scanning programs

CNNPromoter_e - Prediction of Eukaryotic Promoters by CNN models in genomic sequences. (Reference: Umarov RK, & Solovyev VV (2017) PLoS One. 12(2): e0171410).

C. Transcriptional terminators - these only apply to rho-independent terminators; for rho-dependent terminator sites see

Transcription Terminator Prediction (Anne de Jong, University of Groningen, The Netherlands) - part of the excellent Genome2D webserver for Analysis and Visualization of Bacterial Genomes and Transcriptomes

WebGeSTer - Genome Scanner for Terminators - my favourite terminator search program is finally web enabled. Please note that if you want to analyze data from a *.gbk file, you need to use their conversion program "GenBank2GeSTer" first. The program produces a complete description of each terminator, including a diagram. This site is linked to an extensive database of transcriptional terminators in bacterial genomes (WebGeSTer DB) (Reference: Mitra A. et al. 2011. Nucl. Acids Res. 39(Database issue): D129-35).

ARNold - finds rho-independent terminators in nucleic acid sequences using two complementary programs, Erpin and RNAmotif. The program colors the terminator stem and loop (References: Gautheret D, Lambert A. 2001. J Mol Biol. 313: 1003–11 & Macke T. et al. 2001. Nucleic Acids Res. 29: 4724–4735).

FindTerm (Softberry Inc.) - can also be used for mapping rho-independent terminators. You might consider using the advanced options and raising the default energy threshold slightly, to -12.0. Please note that the online version of this program will only find one terminator at a time. If you are dealing with a long sequence, once you have located a terminator, delete it from the DNA sequence and resubmit.

RibEx: Riboswitch Explorer - scans <40kb DNA for potential genes (which are linked to BLASTP) and several hundred regulatory elements, including riboswitches. If you click on "search for attenuators", it finds terminators and antiterminators. (Reference: C. Abreu-Goodger & E. Merino. 2005. Nucl. Acids Res. 33: W690-W692).

iTerm-PseKNC - a webserver for identifying bacterial transcriptional terminators based on a machine learning method. In the predictor, 5-tuple nucleotide frequencies and physicochemical properties were extracted to formulate samples. The binomial distribution technique was used to rank the 1024 5-tuple nucleotides. Incremental feature selection (IFS) was then used to determine the optimal features producing the maximum accuracy, and a support vector machine (SVM) was used to perform the prediction. Five-fold cross-validation showed that 86.07% of terminators and 99.46% of non-terminators were correctly recognized. (Reference: Lai H-Y et al. (2019) Mol Ther Nucleic Acids. 17: 337-346).
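For readers unfamiliar with the "5-tuple nucleotide frequency" feature mentioned above, the sketch below shows one plausible way to turn a sequence into a 1024-element frequency vector (4^5 = 1024 possible 5-mers). It only illustrates that first step; the function name is hypothetical, and the physicochemical properties, binomial ranking, IFS and SVM stages of iTerm-PseKNC are not reproduced here.

```python
from itertools import product

def kmer_frequencies(seq, k=5):
    """Return the relative frequency of every possible k-mer in `seq`.

    The vector is ordered alphabetically over A, C, G, T (AAAAA, AAAAC, ...),
    so k=5 yields 4**5 = 1024 values. Windows containing other characters
    (e.g. N) are skipped, which is an assumption of this sketch.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    # Normalise counts to frequencies; an empty or too-short input gives zeros.
    return [counts[m] / total if total else 0.0 for m in kmers]
```

A vector like this could then be fed, together with other features, into a classifier of one's choice.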