RECOVERING SEQUENCES FROM GENBANK

1. Recovering sequences from GenBank

2. Downloading a series of homologs

3. Extracting Features

1. Recovering sequences from GenBank

(i) Formulate a search strategy at NCBI: select the appropriate "Search" database ("Protein" in this case) and press "Go".

(ii) The results will look like this:

(iii) If you are interested in both the descriptive file and the sequence, click on the GenBank protein accession number (e.g. NP_112054).

(iv) If all you want is the protein sequence, change "Display" to "FASTA" and click on "Display" to get the sequence in FASTA format.

(v) Use the Copy/Paste feature of your browser to transfer this data to Notepad.
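The browser steps above can also be approximated in a script. The following is a minimal sketch, assuming the NCBI E-utilities "efetch" service is reachable and that the accession from step (iii) is the record wanted; the function names build_efetch_url and fetch_protein are hypothetical helpers introduced only for this illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for one record in the Protein database.

    rettype="fasta" asks for the bare sequence (step iv);
    rettype="gp" asks for the descriptive flat file (step iii).
    """
    params = {
        "db": "protein",
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_protein(accession, rettype="fasta"):
    """Download the record as text. Requires network access."""
    with urlopen(build_efetch_url(accession, rettype)) as response:
        return response.read().decode("utf-8")


# Example call (not executed here because it contacts NCBI):
#   text = fetch_protein("NP_112054")
#   open("NP_112054.fasta", "w").write(text)
```

Saving the returned text to a file takes the place of the Copy/Paste step; for more than an occasional request, NCBI's usage guidelines for E-utilities should be consulted.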

2. Downloading a series of homologs

To obtain a text file containing a variety of homologs of a particular protein, proceed as outlined in Section 1 or start from a BLASTP search, as shown here.

Protein sequence:

>gi|9634197|ref|NP_037736.1| Cro protein [Bacteriophage HK97]
MEQRITLKDYAMRFGQTKTAKDLGVYQSAINKAIHAGRNIFLTINADGSVYAEEVKPFPSNKKTTA

(i) Initiate a BLASTP search.

(ii) Here are the results (trimmed for ease of viewing):

(iii) Tick the boxes to the left of the sequences you want to recover, then click the "Get selected sequences" button to get to the following page:

(iv) Again tick the boxes adjacent to the desired sequences, change "Display" to "FASTA" and click on "Display" to give the following:

(v) Use the Copy/Paste feature of your browser to transfer this data to Notepad.
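Once several FASTA records have been pasted into one text file, it is worth checking that every record arrived intact. The sketch below, which uses only the Python standard library, splits FASTA text into (header, sequence) pairs and writes them back out with wrapped lines; parse_fasta and write_fasta are hypothetical helper names, and the sample text is the Cro protein record shown earlier in this section.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def write_fasta(records, path, width=60):
    """Write records to a file, wrapping sequence lines at `width` residues."""
    with open(path, "w") as handle:
        for header, sequence in records:
            handle.write(">" + header + "\n")
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + "\n")


pasted = """>gi|9634197|ref|NP_037736.1| Cro protein [Bacteriophage HK97]
MEQRITLKDYAMRFGQTKTAKDLGVYQSAINKAIHAGRNIFLTINADGSVYAEEVKPFPSNKKTTA
"""

records = parse_fasta(pasted)
for header, sequence in records:
    print(header, len(sequence))
```

Calling write_fasta(records, "homologs.fasta") would then produce the same kind of file as the Notepad step, with "homologs.fasta" being an arbitrary file name chosen for this example.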

3. Extracting Features

The FeatureExtract server extracts sequence and feature annotation, such as intron/exon structure, from GenBank entries and other GenBank-format files (Reference: R. Wernersson. 2005. Nucl. Acids Res. 33: W567-W569).
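To make concrete what "feature annotation" means in a GenBank-format file, here is a minimal standard-library sketch that reads the FEATURES table of a flat file and lists each feature key with its location. It is not the FeatureExtract server's code and covers only the basic layout (feature keys indented by five spaces, qualifiers beginning with "/"); the function name extract_features and the toy record are hypothetical and exist only for illustration.

```python
def extract_features(genbank_text):
    """Return a list of dicts with 'key', 'location' and 'qualifiers'."""
    features = []
    in_table = False      # True once the FEATURES header has been seen
    in_location = False   # True while a location may continue on the next line
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_table = True
            continue
        if not in_table or not line.strip():
            continue
        if not line.startswith(" "):
            break  # next top-level section (e.g. ORIGIN) ends the table
        text = line.strip()
        # A feature key line has exactly five leading spaces.
        if line[:5] == "     " and line[5:6].strip():
            key, _, location = text.partition(" ")
            features.append(
                {"key": key, "location": location.strip(), "qualifiers": []}
            )
            in_location = True
        elif not features:
            continue
        elif text.startswith("/"):
            features[-1]["qualifiers"].append(text)
            in_location = False
        elif in_location:
            features[-1]["location"] += text  # wrapped location, e.g. join(...)
        else:
            features[-1]["qualifiers"][-1] += " " + text  # wrapped qualifier
    return features


# Toy record invented for this example; it is not a real GenBank entry.
SAMPLE = """LOCUS       TOY_RECORD
FEATURES             Location/Qualifiers
     source          1..120
                     /organism="hypothetical organism"
     gene            10..110
                     /gene="toyA"
     CDS             join(10..40,
                     71..110)
                     /gene="toyA"
ORIGIN
//
"""

for feature in extract_features(SAMPLE):
    print(feature["key"], feature["location"])
```

In the toy record the CDS location join(10..40,71..110) describes two exons separated by an intron, which is the kind of intron/exon structure mentioned above. For real records, an established parser such as Biopython's GenBank reader is likely a safer choice than this sketch.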