Precautionary Quote: "We should be quite remiss not to emphasize that despite the popularity of secondary structural prediction schemes, and the almost ritual performance of these calculations, the information available from this is of limited reliability. This is true even of the best methods now known, and much more so of the less successful methods commonly available in sequence analysis packages. Running a secondary structure prediction on a newly-determined sequence just because everyone else does so, is to be deplored, and the fact that the results of such predictions are generally ignored is insufficient justification for doing and publishing them." Arthur Lesk, 1988.

YASPIN secondary structure prediction - an HNN (Hidden Neural Network) secondary structure prediction program that uses the PSI-BLAST algorithm to produce a PSSM for the input sequence, which it then uses to perform its prediction. (Reference: K. Lin et al. 2005. Bioinformatics 21:152-9).

PredictProtein 2013 (Technical University of Munich, Germany) - the developers have substantially expanded the breadth of structural annotations, e.g. by adding predictions of non-regular secondary structure, intrinsically disordered regions, disulphide bridges and inter-residue contacts, and by covering transmembrane beta-barrel structures. They have also added important resources for the prediction of protein function. Registration required.

1D Protein Structure Prediction Server - designed for three types of prediction based on a protein sequence: Secondary Structure Content Prediction, Structural Class Prediction, and Fold Type Prediction.

For a metasite linked to a wide range of online programs for protein sequence analysis and structure prediction, I recommend PredictProtein (ROSTLAB, Technische Universität München). Also see: SCRATCH Protein Predictor (Institute for Genomics & Bioinformatics, University of California, Irvine, U.S.A.).

Several great sites for online analysis of potential membrane-spanning proteins are: (Test sequence; see Orientation of Proteins in Membranes for 268 unique α-helical membrane protein structures)

TMpred - Prediction of transmembrane regions and orientation - ISREC (Swiss Institute for Experimental Cancer Research)
TMHMM - Prediction of transmembrane helices in proteins (Center for Biological Sequence Analysis, The Technical University of Denmark)
DAS - Transmembrane Prediction Server (Stockholm University, Sweden)
SPLIT (D. Juretic, Univ. Split, Croatia) - the transmembrane protein topology prediction server provides clear and colourful output including beta preference and modified hydrophobic moment index.

OCTOPUS - Using a novel combination of hidden Markov models and artificial neural networks, OCTOPUS predicts the correct topology for 94% of a dataset of 124 sequences with known structures. (Reference: Viklund, H. & Elofsson, A. 2008. Bioinformatics 24: 1662-1668)

Phobius - a combined transmembrane topology and signal peptide predictor (Reference: L. Käll et al. 2004. J. Mol. Biol. 338: 1027-1036). This tool can also be accessed here and here.

SPOCTOPUS - also predicts transmembrane topology and signal peptides in combination.

RHYTHM - predicts the orientation of transmembrane helices in channels and membrane-coils, specifically buried versus exposed residues. (Reference: A. Rose et al. 2009. Nucl. Acids Res. 37(Web Server issue):W575-W580)

TMMOD - Hidden Markov Model for Transmembrane Protein Topology Prediction (Dept. Computer & Information Sciences, University of Delaware, U.S.A.) - on the results page click on "show posterior probabilities" to see a TMHMM-type diagram.

PRED-TMR2 (C. Pasquier & S.J. Hamodrakas, Dept. Cell Biology and Biophysics, Univ. Athens, Greece) - when applied to several test sets of transmembrane proteins the system gives a perfect prediction rating of 100% by classifying all the sequences in the transmembrane class. The error rate with non-transmembrane proteins is only 2.5%.

TOPCONS - computes consensus predictions of membrane protein topology using a Hidden Markov Model (HMM) and input from five state-of-the-art topology prediction methods. (Reference: A. Bernsel et al. 2009. Nucleic Acids Res. 37(Web Server issue): W465-8). For a batch server without BLAST runs use TOPCONSsingle.

MINNOU (Membrane protein IdeNtificatioN withOUt explicit use of hydropathy profiles and alignments) - predicts alpha-helical as well as beta-sheet transmembrane (TM) domains based on a compact representation of an amino acid residue and its environment, which consists of predicted solvent accessibility and secondary structure of each amino acid. (Reference: Cao et al. 2006. Bioinformatics 22: 303-309). A legend to help interpret the results is here.

SuperLooper - provides the first online interface for the automatic, quick and interactive search and placement of loops in proteins. (Reference: P.W. Hildebrand et al. 2009. Nucl. Acids Res. 37(Web Server issue):W571-W574)

Transmembrane Kink Predictor (TMKink) - A hallmark of membrane protein structure is the large number of distorted transmembrane helices. Because bends are so prevalent, it is important not only to understand how they are generated but also to learn how to predict their occurrence. The authors find local sequence preferences in kinked helices, most notably a higher abundance of proline, which can be exploited to identify bends from local sequence information. A neural network predictor identifies over two-thirds of all bends (sensitivity 0.70) with high reliability (specificity 0.89). (Reference: Meruelo AD et al. 2011. Protein Sci. 20:1256-64)

SCMMTP (Scoring Card Method Membrane Transport Proteins) - identifies and characterizes membrane transport proteins using propensity scores of dipeptides. The training and test accuracies of SCMMTP are 83.81% and 76.11%, respectively. (Reference: Vasylenko, T. et al. 2015. BMC Bioinformatics 16 (Suppl 1): S8)
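
Several entries above quote sensitivity and specificity figures (for example 0.70 and 0.89 for TMKink). As a reminder of what such figures measure, here is a minimal sketch assuming the standard confusion-matrix definitions; individual papers sometimes define these terms differently. The counts in the usage note are hypothetical, chosen only to reproduce the quoted ratios, and are not taken from any paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real positives (e.g. real kinks) that the predictor finds."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of real negatives (e.g. straight helices) correctly rejected."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(round(sensitivity(true_pos=70, false_neg=30), 2))
    print(round(specificity(true_neg=89, false_pos=11), 2))
```

A predictor can score well on one measure and poorly on the other, so it is worth checking both before relying on a server's headline accuracy.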

For drawing the structure of transmembrane proteins the following sites are available:

Protter - an open-source tool for interactive integration and visualization of annotated and predicted protein sequence features together with experimental proteomic evidence. Protter supports numerous proteomic file formats and automatically integrates a variety of reference protein annotation sources, which can be readily extended via modular plug-ins. A built-in export function produces publication-quality customized protein illustrations, also for large datasets. (Reference: U. Omasits et al. 2014. Bioinformatics 30:884-886). (Figure: diagram of the holin from bacteriophage lambda generated with Protter.)

TOPO2 (S. Johns, UCSF Sequence Analysis Consulting Service, U.S.A.) - this site provides considerable control over the presentation. Extensive documentation is provided here.

TMRPres2D (TransMembrane protein Re-Presentation in 2 Dimensions tool) - this Java tool takes data from a variety of protein folding servers and creates uniform, two-dimensional, high analysis graphical images/models of alpha-helical or beta-barrel transmembrane proteins. (Reference: I.C. Spyropoulos et al. 2004. Bioinformatics 20: 3258-3260).

Signal peptide recognition & subcellular localization:

A. Bacterial proteins

PSORTb (Brinkman Lab, Simon Fraser Univ., Canada) - probably the most accurate bacterial protein subcellular localization predictor. Alternatively use PSORT (Univ. Tokyo, Japan) - a series of programs for the prediction of protein localization sites in cells. Choose programs specific for animal, yeast, plant or bacterial (Gram-negative or Gram-positive) proteins.
PSLpred - an SVM-based method that predicts five major subcellular localizations (cytoplasm, inner membrane, outer membrane, extracellular, periplasm) of proteins from Gram-negative bacteria. This method includes various SVM modules based on different features of the proteins. The hybrid approach achieved an overall accuracy of 91%, reported as the best among the existing methods for the subcellular localization of prokaryotic proteins. (Reference: M. Bhasin et al. 2005. Bioinformatics 21: 2522-2524).

CELLO (subCELlular LOcalization predictive system) - assigns Gram-negative proteins to the cytoplasm, inner membrane, periplasm, outer membrane or extracellular space with an overall prediction accuracy of ca. 89%. Also analyzes eukaryotic and Gram-positive proteins. (Reference: C.S. Yu et al. 2004. Protein Sci. 13:1402-1406). The updated CELLO2GO (Protein subCELlular LOcalization Prediction with Functional Gene Ontology Annotation) combines CELLO and BLAST into one platform. It should be a useful tool for research involving complex subcellular systems, and its output is easily manipulated so that user-specific questions may be readily addressed (Reference: Yu CS et al. 2014. PLoS ONE 9: e99368).

SignalP - predicts the presence and location of signal peptide cleavage sites in Gram-positive, Gram-negative and eukaryotic proteins (Center for Biological Sequence Analysis, The Technical University of Denmark). For an example of a periplasmic protein use test sequence MalE.
Phobius - a combined transmembrane topology and signal peptide predictor (Reference: L. Käll et al. 2004. J. Mol. Biol. 338: 1027-1036).
LipoP 1.0 (Center for Biological Sequence Analysis, Technical University of Denmark) - predicts where signal peptidases I and II will cleave proteins from Gram-negative bacteria.

SecretomeP - produces ab initio predictions of non-classical, i.e. not signal peptide triggered, protein secretion. The method queries a large number of other feature prediction servers to obtain information on various post-translational and localizational aspects of the protein, which are integrated into the final secretion prediction (Reference: J.D. Bendtsen et al. 2005. BMC Microbiology 5: 58).

SSPRED - Identification & classification of proteins involved in bacterial secretion systems. Do not submit more than four proteins at once. (Reference: Pundhir, S., & Kumar, A. 2011. Bioinformation 6: 380-382).

Signal Find Server - includes (a) FlaFind, which predicts archaeal class III (type IV pilin-like) signal peptides and their prepilin peptidase cleavage sites; (b) EppA-pilinFind, which predicts class III signal peptides processed by a unique archaeal prepilin peptidase, EppA; (c) TatFind, which predicts archaeal AND bacterial Twin-Arginine Translocation (Tat) signal peptides; (d) PilFind, which predicts bacterial type IV pilin-like signal peptides and their prepilin peptidase cleavage sites; and (e) TatLipo, which predicts haloarchaeal Tat signal peptides that contain a SPase II cleavage site (lipobox).

Signal-3L 2.0 - an online server for predicting the N-terminal protein signal peptide from the amino acid sequence alone. It is built on a hierarchical mixture model with three layers: (1) discrimination of SP (Signal Peptide) proteins and TMH (TransMembrane Helical) proteins from the other globular proteins; (2) distinguishing SP proteins from TMH proteins; and (3) identifying the cleavage sites of SP proteins. (Reference: Y-Z. Zhang & H-B. Shen. 2017. Journal of Chemical Information and Modeling 57: 988-999)

PrediSi - PREDIction of SIgnal peptides (Karsten Hiller, Technical University of Braunschweig)


B. Eukaryotic proteins

Protein Prowler Subcellular Localisation Predictor - The subcellular localisation predictor is largely based on TargetP. (Reference: M. Boden & J. Hawkins. 2005. Bioinformatics 21: 2279-2286).

DeepLoc-1.0 - predicts the subcellular localization of eukaryotic proteins. It can differentiate between 10 different localizations: Nucleus, Cytoplasm, Extracellular, Mitochondrion, Cell membrane, Endoplasmic reticulum, Chloroplast, Golgi apparatus, Lysosome/Vacuole and Peroxisome. The model achieves good accuracy (78% for 10 categories; 92% for membrane-bound or soluble), outperforming current state-of-the-art algorithms, including those relying on homology information. (Reference: Almagro Armenteros JJ et al. 2017. Bioinformatics 33(21): 3387-3395).

WoLF-PSORT (National Institute of Advanced Industrial Science and Technology, Japan)
pTARGET - This method can predict proteins targeted to nine distinct subcellular locations: cytoplasm, endoplasmic reticulum, extracellular/secreted, Golgi, lysosomes, mitochondria, nucleus, peroxisomes and plasma membrane. A comparison with PSORT showed that pTARGET prediction rates are higher by 11–60% in 6 of the 8 locations tested. (Reference: C. Guda & S. Subramaniam. 2005. Bioinformatics 21:3963-3969)
ProtComp (Softberry, U.S.A.) - can be used to predict the subcellular localization of animal/fungal and plant proteins.

SecretomeP - produces ab initio predictions of non-classical, i.e. not signal peptide triggered, protein secretion. The method queries a large number of other feature prediction servers to obtain information on various post-translational and localizational aspects of the protein, which are integrated into the final secretion prediction (Reference: J.D. Bendtsen et al. 2005. BMC Microbiology 5: 58).

Other sites for secondary structure predictions include:

JPred4 - the latest version of the popular JPred protein secondary structure prediction server, which provides predictions by the JNet algorithm, one of the most accurate methods for secondary structure prediction. In addition to protein secondary structure, JPred also makes predictions of solvent accessibility and coiled-coil regions. JPred4 features higher accuracy, with a blind three-state (α-helix, β-strand and coil) secondary structure prediction accuracy of 82.0%, while solvent accessibility prediction accuracy has been raised to 90% for residues <5% accessible. (Reference: A. Drozdetskiy et al. 2015. Nucl. Acids Res. 43 (W1): W389-W394).

YASPIN secondary structure prediction - a Hidden Neural Network secondary structure prediction program that uses the PSI-BLAST algorithm to produce a Position Specific Scoring Matrix for the input sequence, which it then uses to perform its prediction. It was trained on 2896 structures from the PDB40 database. (Reference: K. Lin et al. 2005. Bioinformatics 21:152-159).

Network Protein Sequence @nalysis at IBCP (Institut de Biologie et Chimie des Protéines, Lyon, France) - has DSC, GORIV, Predator, SOPMA and Hierarchical Neural Network Method plus older programs.

PSIPRED Protein Sequence Analysis Workbench - includes PSIPRED v3.3 (Predict Secondary Structure); DISOPRED3 & DISOPRED2 (Disorder Prediction); pGenTHREADER (Profile Based Fold Recognition); MEMSAT3 & MEMSAT-SVM (Membrane Helix Prediction); BioSerf v2.0 (Automated Homology Modelling); DomPred (Protein Domain Prediction); FFPred 3 (Eukaryotic Function Prediction); GenTHREADER (Rapid Fold Recognition); MEMPACK (SVM Prediction of TM Topology and Helix Packing); pDomTHREADER (Fold Domain Recognition); and DomSerf v2.0 (Automated Domain Modelling by Homology). (Reference: Buchan DWA et al. 2013. Nucl. Acids Res. 41 (W1): W340-W348).

For a full range of properties of your protein, including hydrophobicity, alpha-helix and beta-sheet plots, see ProtScale (ExPASy, Switzerland).

Disordered states:

Many proteins contain regions that do not form well-defined structures, and the following programs help define these regions:

RONN (Regional Order Neural Network) - (Reference: Z.R. Yang et al. 2005. Bioinformatics 21: 3369-3376).
IUPred - The underlying assumption is that globular proteins are composed of amino acids which have the potential to form a large number of favorable interactions, whereas intrinsically unstructured proteins (IUPs) adopt no stable structure because their amino acid composition does not allow sufficient favorable interactions to form. (Reference: Z. Dosztányi et al. 2005. Bioinformatics 21: 3433-3434).
DISOPRED2 (Reference: J.J. Ward et al. 2004. J. Molec. Biol. 337: 635-645).
metaPrDOS - a meta server that predicts natively disordered regions of a protein chain from its amino acid sequence. metaPrDOS returns the disorder tendency of each residue as its prediction result. (Reference: T. Ishida & K. Kinoshita. 2008. Bioinformatics 24: 1344-1348)

MFDp (Multilayered Fusion-based Disorder predictor) - aims to improve over the current disorder predictors. (Reference: M.J. Mizianty et al. 2010. Bioinformatics 26: i489-i496)

MoRFpred - Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. MoRFpred is a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins which identifies all MoRF types (α, β, coil and complex). (Reference: F.M. Disfani et al. 2012. Bioinformatics 28: i75-i83).

Scooby-domain (Sequence hydrophobicity predicts domains) - a method to identify globular regions in protein sequence that are suitable for structural studies. The Scooby-domain Java applet can be used as a tool to visually identify 'foldable' regions in protein sequence. Interesting graphics. (Reference: R.A. George et al. 2005. Nucl. Acids Res. 33: W160-W163).

For estimations on the antigenicity of regions of proteins see:

Antigenicity Plot (JaMBW module) - Given a sequence of amino acids, this program computes and plots the antigenicity along the polypeptide chain, as predicted by the algorithm of Hopp & Woods (1981).
SAbPred - a structure-based antibody prediction server (Reference: J. Dunbar et al. 2016. Nucleic Acids Res. 44(Web Server issue): W474–W478).

EMBOSS Antigenic (EMBOSS package) - this program predicts potentially antigenic regions of a protein sequence, using the method of Kolaskar & Tongaonkar (1990). Also accessible here.

OptimumAntigen™ Design Tool (GenScript) - peptides are optimized using the industry's most advanced antigen design algorithm. Each peptide is measured against several protein databases to confirm the desired epitope specificity. Benefits of using the OptimumAntigen™ Design Tool include avoidance of unexposed epitopes, ability to specify desired cross-reactivity, strong antigenicity of chosen peptide, identification of the best conjugation and presentation options for your desired assay(s), use of built-in peptide tutorial for synthesis and solubility, and guaranteed immune response.

EpiC (The ProteomeBinders Epitope Choice Resource) - collates and presents a structure-function summary and antigenicity prediction of your protein to help you design antibodies that are appropriate to your planned experiments. (Reference: Haslam, N. & Gibson, T. 2010. J. Proteome Res. 9 (7): 3759–3763).

SVMTriP - a method to predict antigenic epitopes using a support vector machine to integrate tri-peptide similarity and propensity. (Reference: B. Yao et al. 2012. PLoS ONE 7(9): e45152).
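
To give a feel for what a Hopp & Woods style plot (see the Antigenicity Plot entry above) computes, here is a minimal sketch: each residue is given a hydrophilicity value and the values are averaged over a sliding window, with peaks suggesting exposed, potentially antigenic stretches. The scale is the commonly tabulated Hopp & Woods set and should be checked against the original paper before serious use. The six-residue window and the function names are assumptions for illustration, not the JaMBW implementation.

```python
# Commonly tabulated Hopp & Woods (1981) hydrophilicity values (verify before use).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hopp_woods_profile(sequence, window=6):
    """Return the mean hydrophilicity of every window along the sequence."""
    seq = sequence.upper()
    if window < 1 or len(seq) < window:
        return []
    # Raises KeyError for non-standard residue letters.
    scores = [HOPP_WOODS[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=6):
    """Return (start index, peptide, score) of the highest-scoring window."""
    profile = hopp_woods_profile(sequence, window)
    if not profile:
        return None
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, sequence.upper()[start:start + window], profile[start]


if __name__ == "__main__":
    print(most_hydrophilic_window("WWWWWWKDEKRD"))
```

Plotting the profile against residue position gives the familiar antigenicity plot; the servers listed above add refinements that this sketch does not attempt.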

To screen for coiled-coil regions in proteins use:

Coils - prediction of coiled-coil regions in proteins (Swiss node of EMBnet, Switzerland) - (Reference: A. Lupas et al. 1991. Science 252: 1162-1164). See also PCOILS and MARCOILS.
Paircoils (MIT Laboratory for Computer Science, U.S.A.) - (Reference: B. Berger et al. 1995. Proc. Natl. Acad. Sci. USA 92: 8259-8263) or MultiCoil - is based on the PairCoil algorithm and is used for locating dimeric and trimeric coiled coils. (Reference: E. Wolf et al. 1997. Protein Sci. 6: 1179-1189). CoilP

REPPER (REPeats and their PERiodicities) - detects and analyzes regions with short gapless repeats in proteins. It finds periodicities by Fourier Transform (FTwin) and internal similarity analysis (REPwin). FTwin assigns numerical values to amino acids that reflect certain properties, for instance hydrophobicity, and gives information on corresponding periodicities. REPwin uses self-alignments and displays repeats that reveal significant internal similarities. They are complemented by PSIPRED and coiled-coil prediction (COILS), making the server a useful analytical tool for fibrous proteins. (Reference: M. Gruber et al. 2005. Nucl. Acids Res. 33: W239-W243).

Beta-barrel outer membrane proteins: (Test sequence)

PRED-TMββ (Bagos, P. G., et al. Dept Cell Biology & Biophysics, University of Athens, Greece) - employs a Hidden Markov Model method, capable of predicting and discriminating beta-barrel outer membrane proteins. Gives one the opportunity to download a custom image plot or a 2D representation.

BetaTPred2 (Bioinformatics Center, Institute of Microbial Technology, India) - predicts β-turns in proteins from a given amino acid sequence using a neural network and multiple alignment. For β-turn prediction, it uses the position-specific score matrices generated by PSI-BLAST and secondary structure predicted by PSIPRED. For a classification of the β-turn type use BetaTurns.

BOMP - The β-barrel outer membrane protein predictor (Reference: Berven, F.S. et al. 2004. Nucl. Acids Res. 32 (Web Server issue): W394-9).

HHomp - detection of outer membrane proteins by HMM-HMM comparisons

ConBBPred - Consensus Prediction of Transmembrane Beta-Barrel Proteins - gives one a choice of eight prediction programs.


Scratch Protein Predictor (Institute for Genomics and Bioinformatics, University of California, Irvine) - programs include: ACCpro: the relative solvent accessibility of protein residues; CMAPpro: Prediction of amino acid contact maps; COBEpro: Prediction of continuous B-cell epitopes; CONpro: predicts whether the number of contacts of each residue in a protein is above or below the average for that residue; DIpro: Prediction of disulphide bridges; DISpro: Prediction of disordered regions; DOMpro: Prediction of domains; SSpro: Prediction of protein secondary structure; SVMcon: Prediction of amino acid contact maps using Support Vector Machines; and 3Dpro: Prediction of protein tertiary structure (Ab Initio).

MESSA: Meta-Server for protein sequence analysis - provides secondary structure (PSIPRED, SSPRO); coil, loop and flexible loop analysis (DISEMBL); identification of low complexity (SEG) and disordered regions (IsUnstruct, DISOPRED, DISEMBL, DISPRO); transmembrane helices (TMHMM, TOPPRED, HMMTOP, MEMSAT); TM helices and signal peptides (MEMSAT_SVM, Phobius); signal peptides (SignalP HMM Mode, SignalP NN Mode); coiled coils (COILS); and positional conservation. A multiple sequence alignment of confident BLAST hits, filtered to those with less than 90% identity and more than 40% coverage, is used to calculate the positional conservation indices of residues in the sequence. The conserved residues usually play important roles in maintaining the function or structure of a protein. The residues are highlighted from white, through yellow to dark red as the conservation level increases. Function Prediction: This section contains predicted function annotation, GO terms and EC numbers (if the query is an enzyme). A confidence level ("very confident", "confident" or "probable") is provided for each prediction. (Reference: Q. Cong & N.V. Grishin. 2012. BMC Biology 10:82).

Quick2D - overview of secondary structure including coiled-coils, transmembrane helices and disordered regions.
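
The MESSA entry above describes filtering BLAST hits (less than 90% identity, more than 40% coverage) before scoring positional conservation. The sketch below shows one way such a step might look. The `Hit` record, the function names, and the conservation measure (fraction of sequences sharing the most common residue in each column) are assumptions for illustration; the entry does not say how MESSA computes its index.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit aligned to the query."""
    aligned: str     # hit residues per query position, '-' for gaps
    identity: float  # percent identity to the query
    coverage: float  # percent of the query covered by the hit


def filter_hits(hits, max_identity=90.0, min_coverage=40.0):
    """Keep hits below the identity cutoff and above the coverage cutoff."""
    return [
        h for h in hits
        if h.identity < max_identity and h.coverage > min_coverage
    ]


def conservation(query, hits):
    """Per-column fraction of non-gap residues equal to the commonest one."""
    rows = [query] + [h.aligned for h in hits]
    scores = []
    for column in zip(*rows):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores


if __name__ == "__main__":
    hits = [
        Hit("ACD", 95.0, 100.0),  # dropped: too similar to the query
        Hit("AC-", 80.0, 66.0),
        Hit("GCD", 50.0, 30.0),   # dropped: covers too little of the query
        Hit("ACE", 70.0, 100.0),
    ]
    print(conservation("ACD", filter_hits(hits)))
```

Removing near-identical hits keeps redundant sequences from inflating the scores, and the coverage cutoff avoids columns supported by only a fragment of a hit.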