PROTEIN SECONDARY STRUCTURE

Precautionary Quote: "We should be quite remiss not to emphasize that despite the popularity of secondary structural prediction schemes, and the almost ritual performance of these calculations, the information available from this is of limited reliability. This is true even of the best methods now known, and much more so of the less successful methods commonly available in sequence analysis packages. Running a secondary structure prediction on a newly-determined sequence just because everyone else does so, is to be deplored, and the fact that the results of such predictions are generally ignored is insufficient justification for doing and publishing them." Arthur Lesk, 1988.

My favourite site is the Protein Sequence Analysis (PSA) Protein Structure Prediction Server (BMERC, the BioMolecular Engineering Research Center at Boston University), which predicts probable secondary structures and folding classes for a given amino acid sequence. Results are available in PostScript, which requires a viewer such as Ghostview. A self-extracting version of the latter for Windows can be obtained free from the University of Wisconsin. In addition, a new private Web format is available.

YASPIN secondary structure prediction - an HNN (Hidden Neural Network) secondary structure prediction program that uses the PSI-BLAST algorithm to produce a PSSM (position-specific scoring matrix) for the input sequence, which it then uses to perform its prediction. (Reference: K. Lin et al. 2005. Bioinformatics 21: 152-159).

1D Protein Structure Prediction Server - designed for three types of prediction based on a protein sequence: Secondary Structure Content Prediction, Structural Class Prediction, and Fold Type Prediction.

For a metasite linked to a wide range of online programs for protein sequence analysis and structure prediction, I recommend PredictProtein (ROSTLAB, Technische Universität München). Also see: SCRATCH Protein Predictor (Institute for Genomics & Bioinformatics, University of California, Irvine, U.S.A.).

Several great sites for online analysis of potential membrane-spanning proteins (Test sequence; see Orientation of Proteins in Membranes for 268 unique α-helical membrane protein structures):

TMpred - Prediction of Trans-membrane Regions and Orientation - ISREC (Swiss Institute for Experimental Cancer Research)
TMHMM - Prediction of transmembrane helices in proteins (Center for Biological Sequence Analysis, The Technical University of Denmark)
DAS - Transmembrane Prediction Server (Stockholm University, Sweden)
SPLIT (D. Juretic, Univ. Split, Croatia) - the Transmembrane Protein Topology Prediction Server provides clear and colourful output including beta preference and modified hydrophobic moment index.

OCTOPUS - Using a novel combination of hidden Markov models and artificial neural networks, OCTOPUS predicts the correct topology for 94% of a dataset of 124 sequences with known structures. (Reference: H. Viklund & A. Elofsson. 2008. Bioinformatics 24: 1662-1668).

Phobius - a combined transmembrane topology and signal peptide predictor (Reference: L. Käll et al. 2004. J. Mol. Biol. 338: 1027-1036). This tool can also be accessed here and here.

SPOCTOPUS - also combines transmembrane topology and signal peptide prediction.

RHYTHM - predicts the orientation of transmembrane helices in channels and membrane-coils, specifically buried versus exposed residues. (Reference: A. Rose et al. 2009. Nucl. Acids Res. 37(Web Server issue): W575-W580).

TMMOD - Hidden Markov Model for Transmembrane Protein Topology Prediction (Dept. Computer & Information Sciences, University of Delaware, U.S.A.) - on the results page click on "show posterior probabilities" to see a TMHMM-type diagram.

PRED-TMR2 (C. Pasquier & S.J. Hamodrakas, Dept. Cell Biology and Biophysics, Univ. Athens, Greece) - when applied to several test sets of transmembrane proteins, the system classified all the sequences in the transmembrane class (a prediction rating of 100%), with an error rate of only 2.5% for non-transmembrane proteins.

TOPCONS - computes consensus predictions of membrane protein topology using a Hidden Markov Model (HMM) and input from five state-of-the-art topology prediction methods. (Reference: A. Bernsel et al. 2009. Nucleic Acids Res. 37(Web Server issue): W465-W468). For a batch server without BLAST runs use TOPCONSsingle.

MINNOU (Membrane protein IdeNtificatioN withOUt explicit use of hydropathy profiles and alignments) - predicts alpha-helical as well as beta-sheet transmembrane (TM) domains based on a compact representation of an amino acid residue and its environment, which consists of the predicted solvent accessibility and secondary structure of each amino acid. (Reference: Cao et al. 2006. Bioinformatics 22: 303-309). A legend to help interpret the results is here.

SuperLooper - provides the first online interface for the automatic, quick and interactive search and placement of loops in proteins. (Reference: P.W. Hildebrand et al. 2009. Nucl. Acids Res. 37(Web Server issue): W571-W574).
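
SPLIT's output, mentioned above, includes a modified hydrophobic moment index. I do not know the details of SPLIT's modification, so the sketch below shows only the classical hydrophobic moment in the sense of Eisenberg: each residue's hydrophobicity is treated as a vector pointing outward from the helix axis, successive residues are rotated by about 100 degrees (the alpha-helix repeat), and the length of the vector sum is reported. The use of the Kyte-Doolittle scale, the function name and the two example sequences are my own assumptions for illustration.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale for this sketch;
# other hydrophobicity scales can be substituted).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(segment, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Return the length of the summed hydrophobicity vectors of a segment.

    angle_deg is the rotation per residue around the axis:
    roughly 100 degrees for an alpha-helix.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * delta) for i, aa in enumerate(segment))
    cos_sum = sum(scale[aa] * math.cos(i * delta) for i, aa in enumerate(segment))
    return math.hypot(sin_sum, cos_sum)


# Hypothetical 18-residue segments (five full helical turns).
amphipathic = "LKKLLKKLLKLLKKLLKK"  # Leu on one face, Lys on the other
uniform = "L" * 18                  # no sidedness at all

print(round(hydrophobic_moment(amphipathic), 2))
print(round(hydrophobic_moment(uniform), 2))
```

A large moment indicates a helix with one hydrophobic and one polar face, which is typical of surface-seeking or channel-lining helices; a segment with no sidedness gives a moment near zero however hydrophobic it is.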

For drawing the structure of transmembrane proteins several sites are available:

Protter - an open-source tool for interactive integration and visualization of annotated and predicted protein sequence features together with experimental proteomic evidence. Protter supports numerous proteomic file formats and automatically integrates a variety of reference protein annotation sources, which can be readily extended via modular plug-ins. A built-in export function produces publication-quality customized protein illustrations, also for large datasets. (Reference: U. Omasits et al. 2013. Bioinformatics, Nov 21, doi: 10.1093/bioinformatics/btt607). [Figure: diagram of the holin from bacteriophage lambda generated with Protter]

TOPO2 (S. Johns, UCSF Sequence Analysis Consulting Service, U.S.A.) - this site provides considerable control over the presentation. Extensive documentation is provided here.

RbDe (F. Campagne, Inst. Computational Biomedicine, Weill Medical College of Cornell University, New York, U.S.A.) - also permits one to prepare useful diagrams of transmembrane proteins.

TMRPres2D (TransMembrane protein Re-Presentation in 2 Dimensions tool) - this Java tool takes data from a variety of protein folding servers and creates uniform, two-dimensional, high analysis graphical images/models of alpha-helical or beta-barrel transmembrane proteins. (Reference: I.C. Spyropoulos et al. 2004. Bioinformatics 20: 3258-3260).

Signal peptide recognition & subcellular localization:

A. Bacterial proteins

SLEP (Surface Localization Extracellular Protein) - a pipeline for predicting the localization of bacterial proteins starting from genome sequences (FASTA formatted). It combines the results of several tools: Glimmer, TMHMM, PRODIV-TMHMM, LipoP and PSORTb.

PSORTb (Brinkman Lab, Simon Fraser Univ., Canada) - probably the most accurate predictor of bacterial protein subcellular localization. Alternatively use PSORT (Univ. Tokyo, Japan) - a series of programs for the prediction of protein localization sites in cells. Choose programs specific for animal, yeast, plant or bacterial (Gram-negative or Gram-positive) proteins.
PSLpred - an SVM-based method that predicts five major subcellular localizations (cytoplasm, inner membrane, outer membrane, extracellular, periplasm) for proteins of Gram-negative bacteria. The method includes various SVM modules based on different features of the proteins. The hybrid approach achieved an overall accuracy of 91%, which the authors report as the best among existing methods for the subcellular localization of prokaryotic proteins. (Reference: M. Bhasin et al. 2005. Bioinformatics 21: 2522-2524).
CELLO (subCELlular LOcalization predictive system) - assigns Gram-negative proteins to the cytoplasm, inner membrane, periplasm, outer membrane or extracellular space with an overall prediction accuracy of ca. 89%. Also analyzes eukaryotic and Gram-positive proteins. (Reference: C.S. Yu et al. 2004. Protein Sci. 13: 1402-1406).
SubLoc - based on SOAP technology, this server/client suite offers a user-friendly interface for searching and predicting protein subcellular location. N.B. It does not predict membrane proteins. (Reference: H. Chen et al. 2006. Bioinformatics 22: 376-377).

SignalP - predicts the presence and location of signal peptide cleavage sites in Gram-positive, Gram-negative and eukaryotic proteins (Center for Biological Sequence Analysis, The Technical University of Denmark). For an example of a periplasmic protein use test sequence MalE.
Phobius - a combined transmembrane topology and signal peptide predictor (Reference: L. Käll et al. 2004. J. Mol. Biol. 338: 1027-1036).
LipoP 1.0 (Center for Biological Sequence Analysis, Technical University of Denmark) - predicts where signal peptidases I and II from Gram-negative bacteria will cleave a protein.

SecretomeP - produces ab initio predictions of non-classical (i.e. not signal-peptide-triggered) protein secretion. The method queries a large number of other feature prediction servers to obtain information on various post-translational and localizational aspects of the protein, which are integrated into the final secretion prediction. (Reference: J.D. Bendtsen et al. 2005. BMC Microbiology 5: 58).

B. Eukaryotic proteins

Protein Prowler Subcellular Localisation Predictor - largely based on TargetP. (Reference: M. Boden & J. Hawkins. 2005. Bioinformatics 21: 2279-2286).
WoLF-PSORT (National Institute of Advanced Science and Technology, Japan)
pTARGET - predicts proteins targeted to nine distinct subcellular locations: cytoplasm, endoplasmic reticulum, extracellular/secreted, Golgi, lysosomes, mitochondria, nucleus, peroxisomes and plasma membrane. A comparison with PSORT showed that pTARGET prediction rates are higher by 11–60% in 6 of the 8 locations tested. (Reference: C. Guda & S. Subramaniam. 2005. Bioinformatics 21: 3963-3969).
ProtComp (Softberry, U.S.A.) - can be used to predict the subcellular localization of animal/fungal and plant proteins.

SecretomeP - produces ab initio predictions of non-classical (i.e. not signal-peptide-triggered) protein secretion. The method queries a large number of other feature prediction servers to obtain information on various post-translational and localizational aspects of the protein, which are integrated into the final secretion prediction. (Reference: J.D. Bendtsen et al. 2005. BMC Microbiology 5: 58).

Other sites for secondary structure predictions include:

JPred - a consensus method for protein secondary structure prediction based upon the PHD, Predator, DSC, NNSSP, Zpred and Mulpred programs (European Bioinformatics Institute, Cambridge, United Kingdom) - My favourite site. N.B. Do not forget to deselect advanced option 1 at the bottom of the page to obtain maximum information.

YASPIN secondary structure prediction - a Hidden Neural Network secondary structure prediction program that uses the PSI-BLAST algorithm to produce a Position Specific Scoring Matrix for the input sequence, which it then uses to perform its prediction. It was trained on 2896 structures from the PDB40 database. (Reference: K. Lin et al. 2005. Bioinformatics 21: 152-159).

Network Protein Sequence @nalysis at IBCP (Institut de Biologie et Chimie des Protéines, Lyon, France) - has DSC, GORIV, Predator, SOPMA and the Hierarchical Neural Network Method plus older programs.

For a different colourful approach try PSIpred (UCL Bioinformatics Unit, Department of Computer Science, University College London, United Kingdom). For a full range of properties of your protein, including hydrophobicity, alpha-helix and beta-sheet plots, see ProtScale (ExPASy, Switzerland).
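
Several of the programs above (YASPIN, and PSIpred as well) start from a Position Specific Scoring Matrix produced by PSI-BLAST. The following is only a minimal sketch of what such a matrix contains, under simplifying assumptions of my own: a gap-free toy alignment, a uniform background frequency of 1/20, and a flat pseudocount. PSI-BLAST itself uses sequence weighting, real background frequencies and a more elaborate pseudocount scheme, so the numbers here are illustrative only. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Build a log2-odds matrix from equal-length, gap-free sequences.

    Returns one dict per alignment column, mapping residue -> score.
    Background frequencies are assumed uniform (1/20).
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, segment):
    """Sum the column scores for a segment as long as the matrix."""
    return sum(column[aa] for column, aa in zip(pssm, segment))


# Hypothetical four-sequence alignment.
alignment = ["MKVL", "MKIL", "MRVL", "MKVI"]
pssm = build_pssm(alignment)
print(round(pssm[0]["M"], 2), round(pssm[0]["W"], 2))
print(score(pssm, "MKVL") > score(pssm, "WWWW"))
```

Residues that are conserved in a column receive positive scores and residues never seen there receive negative ones; it is this per-position profile, rather than the raw sequence, that the neural network methods take as input.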

Disordered states:

Many proteins contain regions that do not form well-defined structures, and the following new programs help define these regions:

RONN (Regional Order Neural Network) - (Reference: Z.R. Yang et al. 2005. Bioinformatics 21: 3369-3376).
IUPred - The underlying assumption is that globular proteins are composed of amino acids which have the potential to form a large number of favorable interactions, whereas intrinsically unstructured proteins (IUPs) adopt no stable structure because their amino acid composition does not allow sufficient favorable interactions to form. (Reference: Z. Dosztányi et al. 2005. Bioinformatics 21: 3433-3434).
DISOPRED2 (Reference: J.J. Ward et al. 2004. J. Molec. Biol. 337: 635-645).
metaPrDOS - a meta server that predicts natively disordered regions of a protein chain from its amino acid sequence. It returns the disorder tendency of each residue. (Reference: T. Ishida & K. Kinoshita. 2008. Bioinformatics 24: 1344-1348).

MFDp (Multilayered Fusion-based Disorder predictor) - aims to improve over the current disorder predictors. (Reference: M.J. Mizianty et al. 2010. Bioinformatics 26: i489-i496).

MoRFpred - Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. MoRFpred is a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins, and it identifies all MoRF types (α, β, coil and complex). (Reference: F.M. Disfani et al. 2012. Bioinformatics 28: i75-i83).

Scooby-domain (Sequence hydrophobicity predicts domains) - a method to identify globular regions in a protein sequence that are suitable for structural studies. The Scooby-domain Java applet can be used as a tool to visually identify 'foldable' regions in a protein sequence. Interesting graphics. (Reference: R.A. George et al. 2005. Nucl. Acids Res. 33: W160-W163).

For estimates of the antigenicity of regions of proteins see:

Antigenicity Plot (JaMBW module) - Given a sequence of amino acids, this program computes and plots the antigenicity along the polypeptide chain, as predicted by the algorithm of Hopp & Woods (1981).
Antigenicity Prediction (Princeton BioMolecules, Langhorne, PA, U.S.A.) - finds antigenic sequences and also reviews them from the standpoint of hydrophobicity, aggregation, and steric hindrance.
EMBOSS Antigenic (EMBOSS package) - this program predicts potentially antigenic regions of a protein sequence, using the method of Kolaskar & Tongaonkar (1990). Also accessible here.
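
The Hopp & Woods approach used by the first of these sites is simple enough to sketch: each residue is given a hydrophilicity value, the values are averaged over a sliding window (six residues in the original description, as far as I recall), and the highest peaks are taken as likely antigenic sites. The sketch below is my own minimal version, not the JaMBW code; the scale values are reproduced from memory of the 1981 paper and should be checked against it before serious use, and the function names and test peptide are hypothetical.

```python
# Hopp & Woods (1981) hydrophilicity values, quoted from memory;
# verify against the original paper before relying on them.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(sequence, window=6):
    """Return the mean hydrophilicity of every window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [HOPP_WOODS[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=6):
    """Return (start index, peptide, mean score) of the highest peak."""
    profile = hydrophilicity_profile(sequence, window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical sequence: a charged stretch flanked by hydrophobic residues.
start, peptide, peak = most_hydrophilic_window("LLVVIIKDEKRDLLVVII")
print(start, peptide, peak)
```

The same sliding-window averaging, with a different scale, underlies most of the hydrophobicity plots mentioned elsewhere on this page, which is one reason such profiles should be read as rough guides rather than as predictions of actual epitopes.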

To screen for coiled-coil regions in proteins use:

Coils - Prediction of Coiled Coil Regions in Proteins (Swiss node of EMBnet, Switzerland) - (Reference: A. Lupas et al. 1991. Science 252: 1162-1164).
Paircoils (MIT Laboratory for Computer Science, U.S.A.) - (Reference: B. Berger et al. 1995. Proc. Natl. Acad. Sci. USA 92: 8259-8263) or MultiCoil - is based on the PairCoil algorithm and is used for locating dimeric and trimeric coiled coils. (Reference: E. Wolf et al. 1997. Protein Sci. 6: 1179-1189).

REPPER (REPeats and their PERiodicities) - detects and analyzes regions with short gapless repeats in proteins. It finds periodicities by Fourier Transform (FTwin) and internal similarity analysis (REPwin). FTwin assigns numerical values to amino acids that reflect certain properties, for instance hydrophobicity, and gives information on corresponding periodicities. REPwin uses self-alignments and displays repeats that reveal significant internal similarities. They are complemented by PSIPRED and coiled coil prediction (COILS), making the server a useful analytical tool for fibrous proteins. (Reference: M. Gruber et al. 2005. Nucl. Acids Res. 33: W239-W243).
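
The idea behind the Fourier step in REPPER can be illustrated with a few lines of code. This is a simplified sketch of the general technique, not REPPER's FTwin implementation: residues are converted to hydrophobicity values (the Kyte-Doolittle scale is my assumption here), the mean is subtracted, and a plain discrete Fourier transform is scanned for the period carrying the most power. The function names and the idealised heptad sequence are hypothetical.

```python
import cmath

# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def power_spectrum(sequence, scale=KYTE_DOOLITTLE):
    """Return (period in residues, power) pairs for the hydrophobicity signal."""
    values = [scale[aa] for aa in sequence]
    mean = sum(values) / len(values)
    centred = [v - mean for v in values]  # remove the constant component
    n = len(centred)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, v in enumerate(centred)
        )
        spectrum.append((n / k, abs(coeff) ** 2))
    return spectrum


def dominant_period(sequence):
    """Return the period (in residues) with the highest power."""
    return max(power_spectrum(sequence), key=lambda pair: pair[1])[0]


# Idealised coiled-coil heptad: hydrophobic at positions a and d.
heptad_repeat = "LEELKKK" * 6
print(dominant_period(heptad_repeat))
```

For this idealised heptad the strongest component falls at a period of 3.5 residues, the familiar signature of hydrophobic positions a and d in a coiled coil. Real sequences give much noisier spectra, which is why servers of this kind combine the transform with windowing and other evidence.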

Domain linkers:

Armadillo Domain Linker Prediction (The Blueprint Initiative, Toronto, Canada) - Proteins are often composed of multiple structural/functional domains. Domain linkers link these domains together and have been found to contain an amino acid signature that is distinct from that of the structurally compact domains. On a set of 211 contiguous two-domain proteins, the sensitivity was 56%.

Beta-barrel outer membrane proteins: (Test sequence)

PRED-TMββ (Bagos, P. G., et al. Dept Cell Biology & Biophysics, University of Athens, Greece) - employs a Hidden Markov Model method, capable of predicting and discriminating beta-barrel outer membrane proteins. Gives one the opportunity to download a custom image plot or a 2D representation (see below):

BetaTPred2 (Bioinformatics Center, Institute of Microbial Technology, India) - predicts β-turns in proteins from the given amino acid sequence using a neural network and multiple alignment information. For β-turn prediction, it uses the position-specific score matrices generated by PSI-BLAST and the secondary structure predicted by PSIPRED. For a classification of the β-turn type use BetaTurns.

TMB-Hunt - amino acid composition based TransMembrane Barrel-Hunt (A. Garrow, University of Leeds, England) - provides one with a color-coded score (& E-value) for an individual protein or a series of proteins. (Reference: A.G. Garrow et al. 2005. Nucl. Acids Res. 33: W193-W197).

TMBETA-NET - Discrimination and Prediction of Transmembrane Beta Strands in Outer Membrane Proteins from amino acid sequence. Presents color-coded TM beta segments and their probabilities. (Reference: M.M. Gromiha & M. Suwa. 2005. Bioinformatics 21: 961-968).

BOMP - The β-barrel outer membrane protein predictor (Reference: F.S. Berven et al. 2004. Nucl. Acids Res. 32(Web Server issue): W394-W399).

ConBBPred - Consensus Prediction of Transmembrane Beta-Barrel Proteins - gives one a choice of eight prediction programs.
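
When several of the programs above are run on the same sequence, the simplest way to combine them is a per-residue majority vote. The sketch below shows only that naive idea; it is not the algorithm used by ConBBPred or by any other consensus server listed on this page, which are more sophisticated. The state letters (i = inside, M = membrane strand, o = outside), the function name and the three example predictions are hypothetical.

```python
from collections import Counter


def consensus(predictions):
    """Combine equal-length per-residue predictions by majority vote.

    Returns the consensus string and, for each position, the fraction
    of methods that agree with it. On a tie, the state that occurs
    first in the list of methods is kept.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    states = []
    support = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        states.append(state)
        support.append(votes / len(predictions))
    return "".join(states), support


# Hypothetical output of three topology predictors for one short segment.
methods = [
    "iiMMMMoo",
    "iiiMMMoo",
    "iMMMMMMo",
]
combined, agreement = consensus(methods)
print(combined)
print([round(a, 2) for a in agreement])
```

Positions where the agreement drops, typically the ends of predicted strands, are the ones worth inspecting by hand before trusting any single program.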

Metasite:

Scratch Protein Predictor (Institute for Genomics and Bioinformatics, University of California, Irvine) - programs include: ACCpro: prediction of the relative solvent accessibility of protein residues; CMAPpro: prediction of amino acid contact maps; COBEpro: prediction of continuous B-cell epitopes; CONpro: prediction of whether the number of contacts of each residue in a protein is above or below the average for that residue; DIpro: prediction of disulphide bridges; DISpro: prediction of disordered regions; DOMpro: prediction of domains; SSpro: prediction of protein secondary structure; SVMcon: prediction of amino acid contact maps using Support Vector Machines; and 3Dpro: prediction of protein tertiary structure (ab initio).