CONVERT

Several sites are available for converting sequences from one format to another.  These include:

 Sequence conversion (Bioinf @ Bugaco) - a huge suite of conversion tools.

 Readseq - developed by D.G. Gilbert (Indiana University), reads and converts biosequences between a selection of common biological sequence formats, including EMBL, GenBank and FASTA.  It is also available here.

 Sequence Format Converter (Bioinformaticsbox) - interconverts GenBank, EMBL, ACce, BSML, Fasta, PIR, Raw and SwissProt file formats. Also available here.

 Format Converter v2.2.5 - This program takes as input a sequence or sequences (e.g., an alignment) in an unspecified format and converts the sequence(s) to a different user-specified format. Also converts *.gbk to *.gff3.

 Alignment format converter (Biotech Vana SL (Biotechvana), Spain)

GenBank to GFF format converter (ApolloRNA)

 JaMBW (European Molecular Biology Laboratory of Heidelberg, Germany) - Java based Molecular Biologist's Workbench. Select Chapter 1 for sequence format conversion (upper <---> lower case; T <---> U; reverse or complement sequence).

 Nucleic Acid Sequence Massager (Allotron Biosensor Corporation) - in addition to removing spurious material (numbers, breaks, HTML, spaces), it changes the format (upper to lower case, complement, reverse, RNA to DNA, and triplets).

 gbk2ptt (A. Villegas, Public Health Agency of Canada) - this will convert a GenBank flat file (*.gbk) to an NCBI Protein Table (*.ptt) file.  The latter is a tab-delimited table of protein features.

 gbk2faa (A. Villegas, Public Health Agency of Canada) - this will convert a GenBank flat file (*.gbk) to a FASTA file including the coding sequences (CDS) translated into amino acids (*.faa).

 gbk2fna (A. Villegas, Public Health Agency of Canada) - this will convert a GenBank flat file (*.gbk) to a FASTA file of the whole genome (a single sequence; *.fna)

 gbk2ffn (P. Konczy, Public Health Agency of Canada) - this will extract from a GenBank flat file (*.gbk) the DNA sequence of each gene and present it in FASTA format (*.ffn).  The program will also extract the features of your gbk file in EXCEL format (coordinates, strand (+/-), length of gene in nt, gene name, description, and any notes associated with the description).  N.B. this program cannot deal with genes which are designated as follows: 125...250 join 500..725.

 gbk2sqn (A. Villegas, Public Health Agency of Canada) - this will convert a GenBank flat file (*.gbk) to an NCBI Sequin submission (*.sqn) file.  This program was designed to convert data generated in Kodon (Applied Maths, Austin, TX) to Sequin format.  N.B. If using the "Bacterial and Plastid" genetic code, please note that the translations of certain CDS will appear as /translation="-XXX....". In Sequin, select the "Bacterial and Plastid" genetic code and translate again so that they appear as /translation="MXXX....".

 Segmenter (C. Laing, Public Health Agency of Canada) - this bit of code is extremely useful if you want to fragment a phage genome into 10-20 kb pieces for BLASTX analysis when looking for frameshifts.
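
The fragmentation idea can be sketched in a few lines of Python. This is not the Segmenter code itself; the function name `segment`, the `overlap` parameter and the 1-based coordinate reporting are assumptions made for illustration.

```python
def segment(sequence, size=10000, overlap=0):
    """Split a sequence into pieces of at most `size` characters.

    Hypothetical helper, not the Segmenter program. Consecutive pieces
    share `overlap` characters. Returns a list of (start, end, piece)
    tuples with 1-based, inclusive coordinates.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    pieces = []
    for start in range(0, len(sequence), step):
        piece = sequence[start:start + size]
        pieces.append((start + 1, start + len(piece), piece))
        if start + size >= len(sequence):
            break  # the last piece already reaches the end of the sequence
    return pieces
```

A small overlap between pieces may help when a frameshift lies close to a piece boundary; whether Segmenter does this is not stated here.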

 extractUpStreamDNA (A. Villegas, Public Health Agency of Canada, Laboratory for Foodborne Zoonoses) - takes a GenBank flat file (*.gbk) as input and, for every CDS that it finds, extracts a pre-determined length of upstream DNA (the length is given as an argument and includes 3 nt for the initiation codon). Output is an FFN file of these upstream DNA sequences.  N.B. this ONLY works for prokaryotic sequences because it does not handle the splits or joins found in eukaryotic sequences.  The data can then be analyzed with programs such as MEME.
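
The coordinate arithmetic behind this kind of extraction can be sketched as follows. This is an illustration only, not the extractUpStreamDNA code: the function names are hypothetical, CDS coordinates are assumed to be 1-based and inclusive with start <= end, and the requested length is assumed to count the 3 nt of the initiation codon.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome, start, end, strand, length):
    """Return `length` nt ending with the initiation codon of one CDS.

    Hypothetical helper. `start` and `end` are 1-based inclusive
    coordinates on the forward strand; `strand` is "+" or "-".
    The result is truncated if the CDS lies near a sequence end.
    """
    if strand == "+":
        codon_end = start + 2                    # last base of the start codon
        begin = max(1, codon_end - length + 1)
        return genome[begin - 1:codon_end]
    if strand == "-":
        codon_start = end - 2                    # codon occupies end-2..end
        stop = min(len(genome), codon_start + length - 1)
        return reverse_complement(genome[codon_start - 1:stop])
    raise ValueError("strand must be '+' or '-'")
```

As with the tool described above, this sketch ignores split or joined features.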

 Convert GenBank to Fasta (G. Rocap, School of Oceanography, University of Washington, U.S.A.) - Select a GenBank formatted file containing a feature table. Select whether to extract translated peptide sequences, the DNA sequence for each feature, or the entire DNA sequence of the whole record. If you choose "Peptide Sequence", your feature table must have "translation" sub-features.

 FeatureExtract - this very useful service extracts sequence and feature annotation, such as intron/exon structure, from GenBank entries and other GenBank format files. (Reference: R. Wernersson. 2005. Nucl. Acids Res. 33 (Web Server issue): W567-W569).  Extraction of 5' and 3' sequences is also possible.

 Sequence editor - carries out numerous functions:

 Antiparallel - Create the antiparallel DNA or RNA strand. For example the sequence ATGC will be converted into GCAT. It combines the Complement and Inverse functions.
 Complement - Create the complement DNA or RNA strand. For example the sequence ATGC will be converted into TACG.
 Inverse - Create the inverse DNA or RNA strand. For example the sequence ATGC will be converted into CGTA.
 T to U - Replace all thymine by uracil. For example the sequence ATUGC will be converted into AUUGC.
 U to T - Replace all uracil by thymine. For example the sequence ATUGC will be converted into ATTGC.
 UCase - Convert the sequence into upper case.
 LCase - Convert the sequence into lower case.
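
The functions listed above are simple string operations. A minimal Python sketch follows; the function names are chosen here for illustration and are not taken from the Sequence editor itself, and only the unambiguous bases A, C, G, T and U are handled.

```python
_DNA = str.maketrans("ACGTacgt", "TGCAtgca")
_RNA = str.maketrans("ACGUacgu", "UGCAugca")


def complement(seq, rna=False):
    """Complement each base without changing the order."""
    return seq.translate(_RNA if rna else _DNA)


def inverse(seq):
    """Reverse the order of the bases."""
    return seq[::-1]


def antiparallel(seq, rna=False):
    """Complement followed by inverse (the reverse complement)."""
    return inverse(complement(seq, rna))


def t_to_u(seq):
    """Replace every T with U, preserving case."""
    return seq.replace("T", "U").replace("t", "u")


def u_to_t(seq):
    """Replace every U with T, preserving case."""
    return seq.replace("U", "T").replace("u", "t")


print(antiparallel("ATGC"))  # GCAT
print(complement("ATGC"))    # TACG
print(inverse("ATGC"))       # CGTA
print(t_to_u("ATUGC"))       # AUUGC
print(u_to_t("ATUGC"))       # ATTGC
```

UCase and LCase correspond to Python's built-in `str.upper()` and `str.lower()`.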

 Shuffle DNA and Sequence Randomizer permit one to randomize a sequence to compare with one's own.
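
One simple way to randomize a sequence is to shuffle its letters, which keeps the base composition unchanged. The sketch below is an assumption about the general approach, not the method used by either tool; the function name and the `seed` parameter are hypothetical.

```python
import random


def shuffle_sequence(seq, seed=None):
    """Return a random permutation of the letters in `seq`.

    Hypothetical helper. The single-base composition is preserved;
    pass `seed` to make the result reproducible.
    """
    rng = random.Random(seed)
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)
```

Because only single-letter composition is preserved, dinucleotide and codon frequencies will generally differ from those of the original sequence.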