MATLAB Bioinformatics Toolbox

 Task 1

Search for the Dengue virus type 1 nucleotide sequence in four different databases:

GenBank, PDB, GenPept, and EMBL.


Store the data in a file, then read it into MATLAB.

Display the source organism for this sequence.

Hint: https://www.mathworks.com/help/bioinfo/ref/genbankread.html


% Retrieve the GenBank record, save it to a file, then read it back.
% The saved file holds GenBank flat-file text, not FASTA, despite its extension.
getgenbank('AY145123','ToFile','Dengue_Gene.fasta');
s = genbankread('Dengue_Gene.fasta')
s.SourceOrganism

% Retrieve the PDB entry, save it to a file, then read it back.
% Per its title in the output, 6W3M is a DENV4 (Dengue virus type 4) structure.
getpdb('6W3M','ToFile','Dengue_pdb.pdb');
pdbstruct = pdbread('Dengue_pdb.pdb')
pdbstruct_Model2 = pdbread('Dengue_pdb.pdb','ModelNum',2)
pdbstruct.Model
pdbstruct_Model2.Model

% Retrieve the GenPept record, save it to a file (GenPept flat-file text), then read it back.
GenPeptData = getgenpept('P17763','ToFile','Dengue_Genpept.fasta')
genpeptread('Dengue_Genpept.fasta')

% Retrieve the EMBL record, save it to a file (EMBL flat-file text), then read it back.
EMBLData = getembl('M87512','ToFile','Dengue_embl.fasta')
seqdata = emblread('Dengue_embl.fasta')

Output

s = struct with fields:
                LocusName: 'AY145123'
      LocusSequenceLength: '10705'
     LocusNumberofStrands: ''
            LocusTopology: 'linear'
        LocusMoleculeType: 'RNA'
     LocusGenBankDivision: 'VRL'
    LocusModificationDate: '30-DEC-2002'
               Definition: 'Dengue virus type 1 recombinant clone rDEN1delta30, complete genome.'
                Accession: 'AY145123'
                  Version: 'AY145123.1'
                       GI: ''
                  Project: []
                   DBLink: []
                 Keywords: []
                  Segment: []
                   Source: 'Dengue virus 1'
           SourceOrganism: [3×67 char]
                Reference: {[1×1 struct]  [1×1 struct]}
                  Comment: []
                 Features: [101×74 char]
                      CDS: [1×1 struct]
                 Sequence: 'agttgttagtctacgtggaccgacaagaacagtttcgaatcggaagcttgcttaacgtagttctaacagttttttattagagagcagatctctgatgaacaaccaacggaaaaagacgggtcgaccgtctttcaatatgctgaaacgcgcgagaaaccgcgtgtcaactgtttcacagttggcgaagagattctcaaaaggattgctttcaggccaaggacccatgaaattggtgatggcttttatagcattcctaagatttctagccatacctccaacagcaggaattttggctagatggggctcattcaagaagaatggagcgatcaaagtgttacggggtttcaagaaagaaatctcaaacatgttgaacataatgaacaggaggaaaagatctgtgaccatgctcctcatgctgctgcccacagccctggcgttccatctgaccacccgagggggagagccgcacatgatagttagcaagcaggaaagaggaaaatcacttttgtttaagacctctgcaggtgtcaacatgtgcacccttattgcaatggatttgggagagttatgtgaggacacaatgacctacaaatgcccccggatcactgagacggaaccagatgacgttgactgttggtgcaatgccacggagacatgggtgacctatggaacatgttctcaaactggtgaacaccgacgagacaaacgttccgtcgcactggcaccacacgtagggcttggtctagaaacaagaaccgaaacgtggatgtcctctgaaggcgcttggaaacaaatacaaaaagtggagacctgggctctgagacacccaggattcacggtgatagccctttttctagcacatgccataggaacatccatcacccagaaagggatcatttttattttgctgatgctggtaactccatccatggccatgcggtgcgtgggaataggcaacagagacttcgtggaaggactgtcaggagctacgtgggtggatgtggtactggagcatggaagttgcgtcactaccatggcaaaagacaaaccaacactggacattgaactcttgaagacggaggtcacaaaccctgccgtcctgcgcaaactgtgcattgaagctaaaatatcaaacaccaccaccgattcgagatgtccaacacaaggagaagccacgctggtggaagaacaggacacgaactttgtgtgtcgacgaacgttcgtggacagaggctggggcaatggttgtgggctattcggaaaaggtagcttaataacgtgtgctaagtttaagtgtgtgacaaaactggaaggaaagatagtccaatatgaaaacttaaaatattcagtgatagtcaccgtacacactggagaccagcaccaagttggaaatgagaccacagaacatggaacaactgcaaccataacacctcaagctcccacgtcggaaatacagctgacagactacggagctctaacattggattgttcacctagaacagggctagactttaatgagatggtgttgttgacaatggaaaaaaaatcatggctcgtccacaaacaatggtttctagacttaccactgccttggacctcgggggcttcaacatcccaagagacttggaatagacaagacttgctggtcacatttaagacagctcatgcaaaaaagcaggaagtagtcgtactaggatcacaagaaggagcaatgcacactgcgttgactggagcgacagaaatccaatcgtctggaacgacaacaatttttgcaggacacctgaaatgcagactaaaaatggataaactgactttaaaagggatgtcatatgtaatgtgcacagggtcattcaagttagagaaggaagtggctgagacccagcatggaactgttctagtgcaggttaaatacgaaggaacagatgcaccatgcaagatccccttctcgtcccaagatgagaagggagtaacc
cagaatgggagattgataacagccaaccccatagtcactgacaaagaaaaaccagtcaacattgaagcggagccaccttttggtgagagctacattgtggtaggagcaggtgaaaaagctttgaaactaagctggttcaagaagggaagcagtatagggaaaatgtttgaagcaactgcccgtggagcacgaaggatggccatcctgggagacactgcatgggacttcggttctataggaggggtgttcacgtctgtgggaaaactgatacaccagatttttgggactgcgtatggagttttgttcagcggtgtttcttggaccatgaagataggaatagggattctgctgacatggctaggattaaactcaaggagcacgtccctttcaatgacgtgtatcgcagttggcatggtcacactgtacctaggagtcatggttcaggcggactcgggatgtgtaatcaactggaaaggcagagaactcaaatgtggaagcggcatttttgtcaccaatgaagtccacacctggacagagcaatataaattccaggccgactcccctaagagactatcagcggccattgggaaggcatgggaggagggtgtgtgtggaattcgatcagccactcgtctcgagaacatcatgtggaagcaaatatcaaatgaattaaaccacatcttacttgaaaatgacatgaaatttacagtggtcgtaggagacgttagtggaatcttggcccaaggaaagaaaatgattaggccacaacccatggaacacaaatactcgtggaaaagctggggaaaagccaaaatcataggagcagatgtacagaataccaccttcatcatcgacggcccaaacaccccagaatgccctgataaccaaagagcatggaacatttgggaagttgaagactatggatttggaattttcacgacaaacatatggttgaaattgcgtgactcctacactcaagtgtgtgaccaccggctaatgtcagctgccatcaaggatagcaaagcagtccatgctgacatggggtactggatagaaagtgaaaagaacgagacttggaagttggcaagagcctccttcatagaagttaagacatgcatctggccaaaatcccacactctatggagcaatggagtcctggaaagtgagatgataatcccaaagatatatggaggaccaatatctcagcacaactacagaccaggatatttcacacaaacagcagggccgtggcacttgggcaagttagaactagattttgatttatgtgaaggtaccactgttgttgtggatgaacattgtggaaatcgaggaccatctcttagaaccacaacagtcacaggaaagacaatccatgaatggtgctgtagatcttgcacgttaccccccctacgtttcaaaggagaagacgggtgctggtacggcatggaaatcagaccagtcaaggagaaggaagagaacctagttaagtcaatggtctctgcagggtcaggagaagtggacagtttttcactaggactgctatgcatatcaataatgatcgaagaggtaatgagatccagatggagcagaaaaatgctgatgactggaacattggctgtgttcctccttctcacaatgggacaattgacatggaatgatctgatcaggctatgtatcatggttggagccaacgcttcagacaagatggggatgggaacaacgtacctagctttgatggccactttcagaatgagaccaatgttcgcagtcgggctactgtttcgcagattaacatctagagaagttcttcttcttacagttggattgagtctggtggcatctgtagaactaccaaattccttagaggagctaggggatggacttgcaatgggcatcatgatgttgaaattactgactgattttcagtcacatcagctatgggctaccttgctgtctttaacatttgtcaaaacaactttttcattgcactatgcatggaa
gacaatggctatgatactgtcaattgtatctctcttccctttatgcctgtccacgacttctcaaaaaacaacatggcttccggtgttgctgggatctcttggatgcaaaccactaaccatgtttcttataacagaaaacaaaatctggggaaggaaaagctggcctctcaatgaaggaattatggctgttggaatagttagcattcttctaagttcacttctcaagaatgatgtgccactagctggcccactaatagctggaggcatgctaatagcatgttatgtcatatctggaagctcggccgatttatcactggagaaagcggctgaggtctcctgggaagaagaagcagaacactctggtgcctcacacaacatactagtggaggtccaagatgatggaaccatgaaaataaaggatgaagagagagatgacacactcaccattctcctcaaagcaactctgctagcaatctcaggggtatacccaatgtcaataccggcgaccctctttgtgtggtatttttggcagaaaaagaaacagagatcaggagtgctatgggacacacccagccctccagaagtggaaagagcagtccttgatgatggcatttatagaattctccaaagaggattgttgggcaggtctcaagtaggagtaggagtttttcaagaaggcgtgttccacacaatgtggcacgtcaccaggggagctgtcctcatgtaccaagggaagagactggaaccaagttgggccagtgtcaaaaaagacttgatctcatatggaggaggttggaggtttcaaggatcctggaacgcgggagaagaagtgcaggtgattgctgttgaaccggggaagaaccccaaaaatgtacagacagcgccgggtaccttcaagacccctgaaggcgaagttggagccatagctctagactttaaacccggcacatctggatctcctatcgtgaacagagagggaaaaatagtaggtctttatggaaatggagtggtgacaacaagtggtacctacgtcagtgccatagctcaagctaaagcatcacaagaagggcctctaccagagattgaggacgaggtgtttaggaaaagaaacttaacaataatggacctacatccaggatcgggaaaaacaagaagataccttccagccatagtccgtgaggccataaaaagaaagctgcgcacgctagtcttagctcccacaagagttgtcgcttctgaaatggcagaggcgctcaagggaatgccaataaggtatcagacaacagcagtgaagagtgaacacacgggaaaggagatagttgaccttatgtgtcacgccactttcactatgcgtctcctgtctcctgtgagagttcccaattataatatgattatcatggatgaagcacatttcaccgatccagccagcatagcagccagagggtatatctcaacccgagtgggtatgggtgaagcagctgcgattttcatgacagccactccccccggatcggtggaggcctttccacagagcaatgcagttatccaagatgaggaaagagacattcctgaaagatcatggaactcaggctatgactggatcactgatttcccaggtaaaacagtctggtttgttccaagcatcaaatcaggaaatgacattgccaactgtttaagaaagaatgggaaacgggtggtccaattgagcagaaaaacttttgacactgagtaccagaaaacaaaaaataacgactgggactatgttgtcacaacagacatatccgaaatgggagcaaacttccgagccgacagggtaatagacccgaggcggtgcctgaaaccggtaatactaaaagatggcccagagcgtgtcattctagccggaccgatgccagtgactgtggctagcgccgcccagaggagaggaagaattggaaggaaccaaaataaggaaggcgatcagtatatttacatgggacagcctctaaaaaatgatgagg
accacgcccattggacagaagcaaaaatgctccttgacaacataaacacaccagaagggattatcccagccctctttgagccggagagagaaaagagtgcagcaatagacggggaatacagactacggggtgaagcgaggaaaacgttcgtggagctcatgagaagaggagatctacctgtctggctatcctacaaagttgcctcagaaggcttccagtactccgacagaaggtggtgctttgatggggaaaggaacaaccaggtgttggaggagaacatggacgtggagatctggacaaaagaaggagaaagaaagaaactacgaccccgctggctggatgccagaacatactctgacccactggctctgcgcgaattcaaagagttcgcagcaggaagaagaagcgtctcaggtgacctaatattagaaatagggaaacttccacaacatttaacgcaaagggcccagaacgccttggacaatctggttatgttgcacaactctgaacaaggaggaaaagcctatagacacgccatggaagaactaccagacaccatagaaacgttaatgctcctagctttgatagctgtgctgactggtggagtgacgttgttcttcctatcaggaaggggtctaggaaaaacatccattggcctactctgcgtgattgcctcaagtgcactgttatggatggccagtgtggaaccccattggatagcggcctctatcatactggagttctttctgatggtgttgcttattccagagccggacagacagcgcactccacaagacaaccagctagcatacgtggtgataggtctgttattcatgatattgacagtggcagccaatgagatgggattactggaaaccacaaagaaggacctggggattggtcatgcagctgctgaaaaccaccatcatgctgcaatgctggacgtagacctacatccagcttcagcctggactctctatgcagtggccacaacaattatcactcccatgatgagacacacaattgaaaacacaacggcaaatatttccctgacagctattgcaaaccaggcagctatattgatgggacttgacaagggatggccaatatcaaagatggacataggagttccacttctcgccttggggtgctattctcaggtgaacccgctgacgctgacagcggcggtatttatgctagtggctcattatgccataattggacccggactgcaagcaaaagctactagagaagctcaaaaaaggacagcagccggaataatgaaaaacccaactgtcgacgggatcgttgcaatagatttggaccctgtggtttacgatgcaaaatttgaaaaacagctaggccaaataatgttgttgatactttgcacatcacagatcctcctgatgcggaccacatgggccttgtgtgaatccatcacactagccactggacctctgaccacgctttgggagggatctccaggaaaattctggaacaccacgatagcggtgtccatggcaaacatttttaggggaagttatctagcaggagcaggtctggccttttcattaatgaaatctctaggaggaggtaggagaggcacgggagcccaaggggaaacactgggagaaaaatggaaaagacagctaaaccaattgagcaagtcagaattcaacacttacaaaaggagtgggattatagaggtggatagatctgaagccaaagaggggttaaaaagaggagaaacgactaaacacgcagtgtcgagaggaacggccaaactgaggtggtttgtggagaggaaccttgtgaaaccagaagggaaagtcatagacctcggttgtggaagaggtggctggtcatattattgcgctgggctgaagaaagtcacagaagtgaaaggatacacgaaaggaggacctggacatgaggaaccaatcccaatggcaacctatggatggaacctagtaaagctatactccgggaaagatgtattc
tttacaccacctgagaaatgtgacaccctcttgtgtgatattggtgagtcctctccgaacccaactatagaagaaggaagaacgttacgtgttctaaagatggtggaaccatggctcagaggaaaccaattttgcataaaaattctaaatccctatatgccgagtgtggtagaaactttggagcaaatgcaaagaaaacatggaggaatgctagtgcgaaatccactctcaagaaactccactcatgaaatgtactgggtttcatgtggaacaggaaacattgtgtcagcagtaaacatgacatctagaatgctgctaaatcgattcacaatggctcacaggaagccaacatatgaaagagacgtggacttaggcgctggaacaagacatgtggcagtagaaccagaggtggccaacctagatatcattggccagaggatagagaatataaaaaatgaacacaaatcaacatggcattatgatgaggacaatccatacaaaacatgggcctatcatggatcatatgaggtcaagccatcaggatcagcctcatccatggtcaatggtgtggtgagactgctaaccaaaccatgggatgtcattcccatggtcacacaaatagccatgactgacaccacaccctttggacaacagagggtgtttaaagagaaagttgacacgcgtacaccaaaagcgaaacgaggcacagcacaaattatggaggtgacagccaggtggttatggggttttctctctagaaacaaaaaacccagaatctgcacaagagaggagttcacaagaaaagtcaggtcaaacgcagctattggagcagtgttcgttgatgaaaatcaatggaactcagcaaaagaggcagtggaagatgaacggttctgggaccttgtgcacagagagagggagcttcataaacaaggaaaatgtgccacgtgtgtctacaacatgatgggaaagagagagaaaaaattaggagagttcggaaaggcaaaaggaagtcgcgcaatatggtacatgtggttgggagcgcgctttttagagtttgaagcccttggtttcatgaatgaagatcactggttcagcagagagaattcactcagtggagtggaaggagaaggactccacaaacttggatacatactcagagacatatcaaagattccagggggaaatatgtatgcagatgacacagccggatgggacacaagaataacagaggatgatcttcagaatgaggccaaaatcactgacatcatggaacctgaacatgccctattggccacgtcaatctttaagctaacctaccaaaacaaggtagtaagggtgcagagaccagcgaaaaatggaaccgtgatggatgtcatatccagacgtgaccagagaggaagtggacaggttggaacctatggcttaaacaccttcaccaacatggaggcccaactaataagacaaatggagtctgagggaatcttttcacccagcgaattggaaaccccaaatctagccgaaagagtcctcgactggttgaaaaaacatggcaccgagaggctgaaaagaatggcaatcagtggagatgactgtgtggtgaaaccaattgatgacagatttgcaacagccttaacagctttgaatgacatgggaaaggtaagaaaagacataccgcaatgggaaccttcaaaaggatggaatgattggcaacaagtgcctttctgttcacaccatttccaccagctgattatgaaggatgggagggagatagtggtgccatgccgcaaccaagatgaacttgtaggtagggccagagtatcacaaggcgccggatggagcttgagagaaactgcatgcctaggcaagtcatatgcacaaatgtggcagctgatgtacttccacaggagagacttgagattagcggctaatgctatctgttcagccgttccagttgattgggtcccaaccagccgtaccacctggtcgatccatgcccacca
tcaatggatgacaacagaagacatgttg…'

ans = 3×67 char array
    'Dengue virus 1                                                     '
    'Viruses; Riboviria; Orthornavirae; Kitrinoviricota; Flasuviricetes;'
    'Amarillovirales; Flaviviridae; Flavivirus.                         '
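
SourceOrganism comes back as a padded character matrix, so the organism name and its lineage usually need a little cleanup before they can be reported. The sketch below is one possible way to do that. It uses a hand-built stand-in matrix (the variable name sourceOrganism and its contents are copied from the output above for illustration) so that it runs without the toolbox; with real data one would pass s.SourceOrganism instead.

```matlab
% Hypothetical stand-in for s.SourceOrganism: a padded char matrix whose
% first row is the organism name and whose remaining rows are the lineage.
sourceOrganism = char( ...
    'Dengue virus 1', ...
    'Viruses; Riboviria; Orthornavirae; Kitrinoviricota; Flasuviricetes;', ...
    'Amarillovirales; Flaviviridae; Flavivirus.');

% Row 1 holds the organism name; strtrim removes the trailing padding.
organismName = strtrim(sourceOrganism(1, :));

% cellstr drops trailing blanks from each row; join the rows into one string.
lineageRows = cellstr(sourceOrganism(2:end, :));
lineage = strjoin(lineageRows', ' ');

fprintf('Organism: %s\n', organismName);
fprintf('Lineage:  %s\n', lineage);
```

Indexing the first row assumes the record follows the layout shown above, with the name first and the taxonomy after it.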

pdbstruct = struct with fields:
            Header: [1×1 struct]
             Title: 'SOLUTION NMR STRUCTURE OF 5'UTR STEM LOOP B IN DENV4 FLAVIVI'
          Compound: [4×23 char]
            Source: [5×58 char]
          Keywords: [2×60 char]
    ExperimentData: 'SOLUTION NMR'
           Authors: [2×60 char]
      RevisionDate: [1×3 struct]
           Journal: [1×1 struct]
           Remark2: [1×1 struct]
           Remark3: [1×1 struct]
           Remark4: [2×59 char]
         Remark100: [3×59 char]
         Remark210: [30×59 char]
         Remark215: [6×59 char]
         Remark300: [6×59 char]
         Remark350: [13×59 char]
         Remark500: [35×59 char]
         Remark900: [4×59 char]
      DBReferences: [1×1 struct]
          Sequence: [1×1 struct]
            Cryst1: [1×1 struct]
           OriginX: [1×3 struct]
             Scale: [1×3 struct]
             Model: [1×10 struct]
            Master: [1×1 struct]

pdbstruct_Model2 = struct with fields:
            Header: [1×1 struct]
             Title: 'SOLUTION NMR STRUCTURE OF 5'UTR STEM LOOP B IN DENV4 FLAVIVI'
          Compound: [4×23 char]
            Source: [5×58 char]
          Keywords: [2×60 char]
    ExperimentData: 'SOLUTION NMR'
           Authors: [2×60 char]
      RevisionDate: [1×3 struct]
           Journal: [1×1 struct]
           Remark2: [1×1 struct]
           Remark3: [1×1 struct]
           Remark4: [2×59 char]
         Remark100: [3×59 char]
         Remark210: [30×59 char]
         Remark215: [6×59 char]
         Remark300: [6×59 char]
         Remark350: [13×59 char]
         Remark500: [35×59 char]
         Remark900: [4×59 char]
      DBReferences: [1×1 struct]
          Sequence: [1×1 struct]
            Cryst1: [1×1 struct]
           OriginX: [1×3 struct]
             Scale: [1×3 struct]
             Model: [1×1 struct]
            Master: [1×1 struct]

ans = 1×10 struct
Fields    MDLSerNo    Atom             Terminal
1         1           1×1320 struct    1×1 struct
2         2           1×1320 struct    1×1 struct
3         3           1×1320 struct    1×1 struct
4         4           1×1320 struct    1×1 struct
5         5           1×1320 struct    1×1 struct
6         6           1×1320 struct    1×1 struct
7         7           1×1320 struct    1×1 struct
8         8           1×1320 struct    1×1 struct
9         9           1×1320 struct    1×1 struct
10        10          1×1320 struct    1×1 struct

ans = struct with fields:
    MDLSerNo: 2
        Atom: [1×1320 struct]
    Terminal: [1×1 struct]
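
Each model stores its atoms as a struct array, so the coordinates have to be gathered field by field before any numeric work. The following is a minimal sketch of that step, assuming the Atom entries carry X, Y, and Z fields as pdbread output normally does. The struct named model and its three atoms are made-up stand-ins so the code runs without the toolbox or the downloaded file; with real data one would use pdbstruct_Model2.Model in its place.

```matlab
% Hypothetical stand-in for pdbstruct_Model2.Model: same field layout as in
% the output above, but with three made-up atoms instead of 1320.
model.MDLSerNo = 2;
model.Atom = struct( ...
    'AtomName', {'P', 'OP1', 'OP2'}, ...
    'X', {1.0, 2.0, 3.0}, ...
    'Y', {0.5, 1.5, 2.5}, ...
    'Z', {-1.0, 0.0, 1.0});

% Collect the coordinates into an N-by-3 matrix, one row per atom.
coords = [[model.Atom.X]', [model.Atom.Y]', [model.Atom.Z]'];

% Geometric centre of the model (mean over atoms, column by column).
centroid = mean(coords, 1);

fprintf('Model %d: %d atoms, centroid = [%.2f %.2f %.2f]\n', ...
    model.MDLSerNo, size(coords, 1), centroid);
```

The same matrix form is convenient for comparing models, for example by subtracting the coordinate matrices of two models atom by atom.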

GenPeptData = struct with fields:
                LocusName: 'POLG_DEN1W'
      LocusSequenceLength: '3392'
     LocusNumberofStrands: ''
            LocusTopology: 'linear'
        LocusMoleculeType: ''
     LocusGenBankDivision: 'VRL'
    LocusModificationDate: '02-DEC-2020'
               Definition: 'RecName: Full=Genome polyprotein; Contains: RecName: Full=Capsid protein C; AltName: Full=Capsid protein; AltName: Full=Core protein; Contains: RecName: Full=Protein prM; AltName: Full=Precursor membrane protein; Contains: RecName: Full=Peptide pr; AltName: Full=Peptide precursor; Contains: RecName: Full=Small envelope protein M; AltName: Full=Matrix protein; Contains: RecName: Full=Envelope protein E; Contains: RecName: Full=Non-structural protein 1; Short=NS1; Contains: RecName: Full=Non-structural protein 2A; Short=NS2A; Contains: RecName: Full=Serine protease subunit NS2B; AltName: Full=Flavivirin protease NS2B regulatory subunit; AltName: Full=Non-structural protein 2B; Contains: RecName: Full=Serine protease NS3; AltName: Full=Flavivirin protease NS3 catalytic subunit; AltName: Full=Non-structural protein 3; Contains: RecName: Full=Non-structural protein 4A; Short=NS4A; Contains: RecName: Full=Peptide 2k; Contains: RecName: Full=Non-structural protein 4B; Short=NS4B; Contains: RecName: Full=RNA-directed RNA polymerase NS5; AltName: Full=Non-structural protein 5.'
                Accession: 'P17763'
                  Version: 'P17763.2'
                       GI: ''
                  Project: []
                   DBLink: ' DBSOURCE    UniProtKB: locus POLG_DEN1W, accession P17763; class: standard. extra accessions:P27910,P89313,P89314 created: Aug 1, 1990. sequence updated: Dec 12, 2006. annotation updated: Dec 2, 2020. xrefs: U88535.1, AAB70694.1, U88536.1, AAB70695.1, M23027.1, AAA42940.1, D00503.1, BAA00395.1, GNWVWP, NP_059433.1, 3J8D_B, 3J8D_F, 3L6P_A, 3LKW_A, 4AL8_C, 4GSX_A, 4GSX_B, 4GT0_A, 4GT0_B, 4LCY_C, 4LCY_J, 4OIG_A, 4OIG_B, 4OIG_D, 4OIG_E, 5VIC_E, 5WJL_C, 5WJL_F, 5WJL_I, 5WKF_C, 5WKF_H xrefs (non-sequence databases): PDBsum:3J8D, PDBsum:3L6P, PDBsum:3LKW, PDBsum:4AL8, PDBsum:4GSX, PDBsum:4GT0, PDBsum:4LCY, PDBsum:4OIG, PDBsum:5VIC, PDBsum:5WJL, PDBsum:5WKF, SMR:P17763, IntAct:P17763, ABCD:P17763, GeneID:5075725, KEGG:vg:5075725, EvolutionaryTrace:P17763, PRO:PR:P17763, Proteomes:UP000002500, GO:0039714, GO:0005576, GO:0044167, GO:0033650, GO:0042025, GO:0044220, GO:0016021, GO:0044385, GO:0019028, GO:0019031, GO:0055036, GO:0005524, GO:0003725, GO:0005216, GO:0046872, GO:0004482, GO:0004483, GO:0046983, GO:0003724, GO:0003968, GO:0004252, GO:0005198, GO:0075512, GO:0039654, GO:0039520, GO:0039707, GO:0051259, GO:0039545, GO:0039564, GO:0039574, GO:0039502, GO:0046762, GO:0039694, GO:0019062, CDD:cd12149, DisProt:DP01929, Gene3D:1.10.10.930, Gene3D:1.10.8.970, Gene3D:1.20.1280.260, Gene3D:2.40.10.10, Gene3D:2.60.260.50, Gene3D:2.60.40.350, Gene3D:2.60.98.10, Gene3D:3.30.387.10, Gene3D:3.30.67.10, InterPro:IPR011492, InterPro:IPR043502, InterPro:IPR000069, InterPro:IPR038302, InterPro:IPR013755, InterPro:IPR001122, InterPro:IPR037172, InterPro:IPR027287, InterPro:IPR026470, InterPro:IPR038345, InterPro:IPR001157, InterPro:IPR000752, InterPro:IPR000487, InterPro:IPR000404, InterPro:IPR001528, InterPro:IPR002535, InterPro:IPR038688, InterPro:IPR000336, InterPro:IPR001850, InterPro:IPR014412, InterPro:IPR011998, InterPro:IPR036253, InterPro:IPR038055, InterPro:IPR013756, InterPro:IPR014001, InterPro:IPR001650, InterPro:IPR014756, 
InterPro:IPR026490, InterPro:IPR027417, InterPro:IPR009003, InterPro:IPR043504, InterPro:IPR000208, InterPro:IPR007094, InterPro:IPR002877, InterPro:IPR029063, Pfam:PF01003, Pfam:PF07652, Pfam:PF02832, Pfam:PF00869, Pfam:PF01004, Pfam:PF00948, Pfam:PF01005, Pfam:PF01002, Pfam:PF01350, Pfam:PF01349, Pfam:PF00972, Pfam:PF01570, Pfam:PF01728, Pfam:PF00949, PIRSF:PIRSF003817, SMART:SM00487, SMART:SM00490, SUPFAM:SSF101257, SUPFAM:SSF50494, SUPFAM:SSF52540, SUPFAM:SSF53335, SUPFAM:SSF56672, SUPFAM:SSF56983, SUPFAM:SSF81296, TIGRFAMs:TIGR04240, PROSITE:PS51527, PROSITE:PS51528, PROSITE:PS51192, PROSITE:PS51194, PROSITE:PS50507, PROSITE:PS51591'
                 Keywords: [22×79 char]
                  Segment: []
                   Source: 'Dengue virus 1 Nauru/West Pac/1974'
           SourceOrganism: [3×67 char]
                Reference: {1×48 cell}
                  Comment: [304×67 char]
                 Features: [964×74 char]
                      CDS: []
                 Sequence: 'mnnqrkktgrpsfnmlkrarnrvstvsqlakrfskgllsgqgpmklvmafiaflrflaipptagilarwgsfkkngaikvlrgfkkeisnmlnimnrrkrsvtmllmllptalafhlttrggephmivskqergksllfktsagvnmctliamdlgelcedtmtykcpritetepddvdcwcnatetwvtygtcsqtgehrrdkrsvalaphvglgletrtetwmssegawkqiqkvetwalrhpgftvialflahaigtsitqkgiifillmlvtpsmamrcvgignrdfveglsgatwvdvvlehgscvttmakdkptldiellktevtnpavlrklcieakisntttdsrcptqgeatlveeqdtnfvcrrtfvdrgwgngcglfgkgslitcakfkcvtklegkivqyenlkysvivtvhtgdqhqvgnettehgttatitpqaptseiqltdygaltldcsprtgldfnemvlltmekkswlvhkqwfldlplpwtsgastsqetwnrqdllvtfktahakkqevvvlgsqegamhtaltgateiqtsgtttifaghlkcrlkmdkltlkgmsyvmctgsfklekevaetqhgtvlvqvkyegtdapckipfssqdekgvtqngrlitanpivtdkekpvnieaeppfgesyivvgagekalklswfkkgssigkmfeatargarrmailgdtawdfgsiggvftsvgklihqifgtaygvlfsgvswtmkigigilltwlglnsrstslsmtciavgmvtlylgvmvqadsgcvinwkgrelkcgsgifvtnevhtwteqykfqadspkrlsaaigkaweegvcgirsatrlenimwkqisnelnhillendmkftvvvgdvsgilaqgkkmirpqpmehkyswkswgkakiigadvqnttfiidgpntpecpdnqrawniwevedygfgifttniwlklrdsytqvcdhrlmsaaikdskavhadmgywieseknetwklarasfievktciwpkshtlwsngvlesemiipkiyggpisqhnyrpgyftqtagpwhlgkleldfdlcegttvvvdehcgnrgpslrtttvtgktihewccrsctlpplrfkgedgcwygmeirpvkekeenlvksmvsagsgevdsfslgllcisimieevmrsrwsrkmlmtgtlavfllltmgqltwndlirlcimvganasdkmgmgttylalmatfrmrpmfavgllfrrltsrevllltvglslvasvelpnsleelgdglamgimmlklltdfqshqlwatllsltfvkttfslhyawktmamilsivslfplclsttsqkttwlpvllgslgckpltmflitenkiwgrkswplnegimavgivsillssllkndvplagpliaggmliacyvisgssadlslekaaevsweeeaehsgashnilvevqddgtmkikdeerddtltillkatllaisgvypmsipatlfvwyfwqkkkqrsgvlwdtpsppeveravlddgiyrilqrgllgrsqvgvgvfqegvfhtmwhvtrgavlmyqgkrlepswasvkkdlisygggwrfqgswnageevqviavepgknpknvqtapgtfktpegevgaialdfkpgtsgspivnregkivglygngvvttsgtyvsaiaqakasqegplpeiedevfrkrnltimdlhpgsgktrrylpaivreairrnvrtlvlaptrvvasemaealkgmpiryqttavksehtgkeivdlmchatftmrllspvrvpnynmiimdeahftdpasiaargyistrvgmgeaaaifmtatppgsveafpqsnaviqdeerdiperswnsgydwitdfpgktvwfvpsiksgndianclrkngkrvvqlsrktfdteyqktknndwdyvvttdisemganfradrvidprrclkpvilkdgpervilagpmpvtvasaaqrrgrigrnqnkegdqyiymgqplnndedhahwteakmlld
nintpegiipalfepereksaaidgeyrlrgearktfvelmrrgdlpvwlsykvasegfqysdrrwcfdgernnqvleenmdveiwtkegerkklrprwldartysdplalrefkefaagrrsvsgdlileigklpqhltqraqnaldnlvmlhnseqggkayrhameelpdtietlmllaliavltggvtlfflsgrglgktsigllcviassallwmasvephwiaasiilefflmvllipepdrqrtpqdnqlayvvigllfmiltaaanemgllettkkdlgighaaaenhhhaamldvdlhpasawtlyavattiitpmmrhtienttanisltaianqaailmgldkgwpiskmdigvpllalgcysqvnpltltaavfmlvahyaiigpglqakatreaqkrtaagimknptvdgivaidldpvvydakfekqlgqimllilctsqillmrttwalcesitlatgplttlwegspgkfwnttiavsmanifrgsylagaglafslmkslgggrrgtgaqgetlgekwkrqlnqlsksefntykrsgiievdrseakeglkrgeptkhavsrgtaklrwfvernlvkpegkvidlgcgrggwsyycaglkkvtevkgytkggpgheepipmatygwnlvklysgkdvfftppekcdtllcdigesspnptieegrtlrvlkmvepwlrgnqfcikilnpympsvvetleqmqrkhggmlvrnplsrnsthemywvscgtgnivsavnmtsrmllnrftmahrkptyerdvdlgagtrhvavepevanldiigqrieniknghkstwhydednpyktwayhgsyevkpsgsassmvngvvrlltkpwdvipmvtqiamtdttpfgqqrvfkekvdtrtpkakrgtaqimevtarwlwgflsrnkkprictreeftrkvrsnaaigavfvdenqwnsakeavederfwdlvhrerelhkqgkcatcvynmmgkrekklgefgkakgsraiwymwlgarflefealgfmnedhwfsrenslsgvegeglhklgyilrdiskipggnmyaddtagwdtriteddlqneakitdimepehallatsifkltyqnkvvrvqrpakngtvmdvisrrdqrgsgqvgtyglntftnmeaqlirqmesegifspseletpnlaervldwlkkhgterlkrmaisgddcvvkpiddrfataltalndmgkvrkdipqwepskgwndwqqvpfcshhfhqlimkdgreivvpcrnqdelvgrarvsqgagwslretaclgksyaqmwqlmyfhrrdlrlaanaicsavpvdwvptsrttwsihahhqwmttedmlsvwnrvwieenpwmedkthvsswedvpylgkredrwcgsligltaratwatniqvainqvrrlignenyldfmtsmkrfknesdpegalw'
                SearchURL: 'https://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&id=P17763'
              RetrieveURL: 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=119364637&rettype=gp&retmode=text'

ans = struct with fields:
                LocusName: 'POLG_DEN1W'
      LocusSequenceLength: '3392'
     LocusNumberofStrands: ''
            LocusTopology: 'linear'
        LocusMoleculeType: ''
     LocusGenBankDivision: 'VRL'
    LocusModificationDate: '02-DEC-2020'
               Definition: 'RecName: Full=Genome polyprotein; Contains: RecName: Full=Capsid protein C; AltName: Full=Capsid protein; AltName: Full=Core protein; Contains: RecName: Full=Protein prM; AltName: Full=Precursor membrane protein; Contains: RecName: Full=Peptide pr; AltName: Full=Peptide precursor; Contains: RecName: Full=Small envelope protein M; AltName: Full=Matrix protein; Contains: RecName: Full=Envelope protein E; Contains: RecName: Full=Non-structural protein 1; Short=NS1; Contains: RecName: Full=Non-structural protein 2A; Short=NS2A; Contains: RecName: Full=Serine protease subunit NS2B; AltName: Full=Flavivirin protease NS2B regulatory subunit; AltName: Full=Non-structural protein 2B; Contains: RecName: Full=Serine protease NS3; AltName: Full=Flavivirin protease NS3 catalytic subunit; AltName: Full=Non-structural protein 3; Contains: RecName: Full=Non-structural protein 4A; Short=NS4A; Contains: RecName: Full=Peptide 2k; Contains: RecName: Full=Non-structural protein 4B; Short=NS4B; Contains: RecName: Full=RNA-directed RNA polymerase NS5; AltName: Full=Non-structural protein 5.'
                Accession: 'P17763'
                  Version: 'P17763.2'
                       GI: ''
                  Project: []
                   DBLink: ' DBSOURCE    UniProtKB: locus POLG_DEN1W, accession P17763; class: standard. extra accessions:P27910,P89313,P89314 created: Aug 1, 1990. sequence updated: Dec 12, 2006. annotation updated: Dec 2, 2020. xrefs: U88535.1, AAB70694.1, U88536.1, AAB70695.1, M23027.1, AAA42940.1, D00503.1, BAA00395.1, GNWVWP, NP_059433.1, 3J8D_B, 3J8D_F, 3L6P_A, 3LKW_A, 4AL8_C, 4GSX_A, 4GSX_B, 4GT0_A, 4GT0_B, 4LCY_C, 4LCY_J, 4OIG_A, 4OIG_B, 4OIG_D, 4OIG_E, 5VIC_E, 5WJL_C, 5WJL_F, 5WJL_I, 5WKF_C, 5WKF_H xrefs (non-sequence databases): PDBsum:3J8D, PDBsum:3L6P, PDBsum:3LKW, PDBsum:4AL8, PDBsum:4GSX, PDBsum:4GT0, PDBsum:4LCY, PDBsum:4OIG, PDBsum:5VIC, PDBsum:5WJL, PDBsum:5WKF, SMR:P17763, IntAct:P17763, ABCD:P17763, GeneID:5075725, KEGG:vg:5075725, EvolutionaryTrace:P17763, PRO:PR:P17763, Proteomes:UP000002500, GO:0039714, GO:0005576, GO:0044167, GO:0033650, GO:0042025, GO:0044220, GO:0016021, GO:0044385, GO:0019028, GO:0019031, GO:0055036, GO:0005524, GO:0003725, GO:0005216, GO:0046872, GO:0004482, GO:0004483, GO:0046983, GO:0003724, GO:0003968, GO:0004252, GO:0005198, GO:0075512, GO:0039654, GO:0039520, GO:0039707, GO:0051259, GO:0039545, GO:0039564, GO:0039574, GO:0039502, GO:0046762, GO:0039694, GO:0019062, CDD:cd12149, DisProt:DP01929, Gene3D:1.10.10.930, Gene3D:1.10.8.970, Gene3D:1.20.1280.260, Gene3D:2.40.10.10, Gene3D:2.60.260.50, Gene3D:2.60.40.350, Gene3D:2.60.98.10, Gene3D:3.30.387.10, Gene3D:3.30.67.10, InterPro:IPR011492, InterPro:IPR043502, InterPro:IPR000069, InterPro:IPR038302, InterPro:IPR013755, InterPro:IPR001122, InterPro:IPR037172, InterPro:IPR027287, InterPro:IPR026470, InterPro:IPR038345, InterPro:IPR001157, InterPro:IPR000752, InterPro:IPR000487, InterPro:IPR000404, InterPro:IPR001528, InterPro:IPR002535, InterPro:IPR038688, InterPro:IPR000336, InterPro:IPR001850, InterPro:IPR014412, InterPro:IPR011998, InterPro:IPR036253, InterPro:IPR038055, InterPro:IPR013756, InterPro:IPR014001, InterPro:IPR001650, InterPro:IPR014756, 
InterPro:IPR026490, InterPro:IPR027417, InterPro:IPR009003, InterPro:IPR043504, InterPro:IPR000208, InterPro:IPR007094, InterPro:IPR002877, InterPro:IPR029063, Pfam:PF01003, Pfam:PF07652, Pfam:PF02832, Pfam:PF00869, Pfam:PF01004, Pfam:PF00948, Pfam:PF01005, Pfam:PF01002, Pfam:PF01350, Pfam:PF01349, Pfam:PF00972, Pfam:PF01570, Pfam:PF01728, Pfam:PF00949, PIRSF:PIRSF003817, SMART:SM00487, SMART:SM00490, SUPFAM:SSF101257, SUPFAM:SSF50494, SUPFAM:SSF52540, SUPFAM:SSF53335, SUPFAM:SSF56672, SUPFAM:SSF56983, SUPFAM:SSF81296, TIGRFAMs:TIGR04240, PROSITE:PS51527, PROSITE:PS51528, PROSITE:PS51192, PROSITE:PS51194, PROSITE:PS50507, PROSITE:PS51591'
                 Keywords: [22×79 char]
                  Segment: []
                   Source: 'Dengue virus 1 Nauru/West Pac/1974'
           SourceOrganism: [3×67 char]
                Reference: {1×48 cell}
                  Comment: [304×67 char]
                 Features: [964×74 char]
                      CDS: []
                 Sequence: 'mnnqrkktgrpsfnmlkrarnrvstvsqlakrfskgllsgqgpmklvmafiaflrflaipptagilarwgsfkkngaikvlrgfkkeisnmlnimnrrkrsvtmllmllptalafhlttrggephmivskqergksllfktsagvnmctliamdlgelcedtmtykcpritetepddvdcwcnatetwvtygtcsqtgehrrdkrsvalaphvglgletrtetwmssegawkqiqkvetwalrhpgftvialflahaigtsitqkgiifillmlvtpsmamrcvgignrdfveglsgatwvdvvlehgscvttmakdkptldiellktevtnpavlrklcieakisntttdsrcptqgeatlveeqdtnfvcrrtfvdrgwgngcglfgkgslitcakfkcvtklegkivqyenlkysvivtvhtgdqhqvgnettehgttatitpqaptseiqltdygaltldcsprtgldfnemvlltmekkswlvhkqwfldlplpwtsgastsqetwnrqdllvtfktahakkqevvvlgsqegamhtaltgateiqtsgtttifaghlkcrlkmdkltlkgmsyvmctgsfklekevaetqhgtvlvqvkyegtdapckipfssqdekgvtqngrlitanpivtdkekpvnieaeppfgesyivvgagekalklswfkkgssigkmfeatargarrmailgdtawdfgsiggvftsvgklihqifgtaygvlfsgvswtmkigigilltwlglnsrstslsmtciavgmvtlylgvmvqadsgcvinwkgrelkcgsgifvtnevhtwteqykfqadspkrlsaaigkaweegvcgirsatrlenimwkqisnelnhillendmkftvvvgdvsgilaqgkkmirpqpmehkyswkswgkakiigadvqnttfiidgpntpecpdnqrawniwevedygfgifttniwlklrdsytqvcdhrlmsaaikdskavhadmgywieseknetwklarasfievktciwpkshtlwsngvlesemiipkiyggpisqhnyrpgyftqtagpwhlgkleldfdlcegttvvvdehcgnrgpslrtttvtgktihewccrsctlpplrfkgedgcwygmeirpvkekeenlvksmvsagsgevdsfslgllcisimieevmrsrwsrkmlmtgtlavfllltmgqltwndlirlcimvganasdkmgmgttylalmatfrmrpmfavgllfrrltsrevllltvglslvasvelpnsleelgdglamgimmlklltdfqshqlwatllsltfvkttfslhyawktmamilsivslfplclsttsqkttwlpvllgslgckpltmflitenkiwgrkswplnegimavgivsillssllkndvplagpliaggmliacyvisgssadlslekaaevsweeeaehsgashnilvevqddgtmkikdeerddtltillkatllaisgvypmsipatlfvwyfwqkkkqrsgvlwdtpsppeveravlddgiyrilqrgllgrsqvgvgvfqegvfhtmwhvtrgavlmyqgkrlepswasvkkdlisygggwrfqgswnageevqviavepgknpknvqtapgtfktpegevgaialdfkpgtsgspivnregkivglygngvvttsgtyvsaiaqakasqegplpeiedevfrkrnltimdlhpgsgktrrylpaivreairrnvrtlvlaptrvvasemaealkgmpiryqttavksehtgkeivdlmchatftmrllspvrvpnynmiimdeahftdpasiaargyistrvgmgeaaaifmtatppgsveafpqsnaviqdeerdiperswnsgydwitdfpgktvwfvpsiksgndianclrkngkrvvqlsrktfdteyqktknndwdyvvttdisemganfradrvidprrclkpvilkdgpervilagpmpvtvasaaqrrgrigrnqnkegdqyiymgqplnndedhahwteakmlld
nintpegiipalfepereksaaidgeyrlrgearktfvelmrrgdlpvwlsykvasegfqysdrrwcfdgernnqvleenmdveiwtkegerkklrprwldartysdplalrefkefaagrrsvsgdlileigklpqhltqraqnaldnlvmlhnseqggkayrhameelpdtietlmllaliavltggvtlfflsgrglgktsigllcviassallwmasvephwiaasiilefflmvllipepdrqrtpqdnqlayvvigllfmiltaaanemgllettkkdlgighaaaenhhhaamldvdlhpasawtlyavattiitpmmrhtienttanisltaianqaailmgldkgwpiskmdigvpllalgcysqvnpltltaavfmlvahyaiigpglqakatreaqkrtaagimknptvdgivaidldpvvydakfekqlgqimllilctsqillmrttwalcesitlatgplttlwegspgkfwnttiavsmanifrgsylagaglafslmkslgggrrgtgaqgetlgekwkrqlnqlsksefntykrsgiievdrseakeglkrgeptkhavsrgtaklrwfvernlvkpegkvidlgcgrggwsyycaglkkvtevkgytkggpgheepipmatygwnlvklysgkdvfftppekcdtllcdigesspnptieegrtlrvlkmvepwlrgnqfcikilnpympsvvetleqmqrkhggmlvrnplsrnsthemywvscgtgnivsavnmtsrmllnrftmahrkptyerdvdlgagtrhvavepevanldiigqrieniknghkstwhydednpyktwayhgsyevkpsgsassmvngvvrlltkpwdvipmvtqiamtdttpfgqqrvfkekvdtrtpkakrgtaqimevtarwlwgflsrnkkprictreeftrkvrsnaaigavfvdenqwnsakeavederfwdlvhrerelhkqgkcatcvynmmgkrekklgefgkakgsraiwymwlgarflefealgfmnedhwfsrenslsgvegeglhklgyilrdiskipggnmyaddtagwdtriteddlqneakitdimepehallatsifkltyqnkvvrvqrpakngtvmdvisrrdqrgsgqvgtyglntftnmeaqlirqmesegifspseletpnlaervldwlkkhgterlkrmaisgddcvvkpiddrfataltalndmgkvrkdipqwepskgwndwqqvpfcshhfhqlimkdgreivvpcrnqdelvgrarvsqgagwslretaclgksyaqmwqlmyfhrrdlrlaanaicsavpvdwvptsrttwsihahhqwmttedmlsvwnrvwieenpwmedkthvsswedvpylgkredrwcgsligltaratwatniqvainqvrrlignenyldfmtsmkrfknesdpegalw'

EMBLData = struct with fields:
            Identification: [1×1 struct]
                 Accession: 'M87512'
           SequenceVersion: 'M87512.1'
               DateCreated: '03-JUN-1992 (Rel. 32, Created)'
               DateUpdated: '04-MAR-2000 (Rel. 63, Last updated, Version 3)'
               Description: 'Dengue virus type 1 complete genome.'
                   Keyword: 'complete genome.'
           OrganismSpecies: 'Dengue virus 1'
    OrganismClassification: 'Viruses; Riboviria; Flaviviridae; Flavivirus.'
                 Organelle: ''
                 Reference: {[1×1 struct]}
    DatabaseCrossReference: [56×50 char]
                  Comments: [2×61 char]
                  Assembly: ''
                   Feature: [7×50 char]
                 BaseCount: [1×1 struct]
                  Sequence: 'gtggaccgcaaagaacagtttcgaatcggaagcttgcttaacgtagttctaacagttttttattagagagcagatctctgatgaacaaccaacgaaaaaagacggctcgaccgtctttcaatatgctgaaacgcgcgagaaaccgcgtgtcaactggttcacagttggcgaagagattctcaaaaggattgctttcaggccaaggacccatgaaattggtgatggctttcatagcattcctaagatttctagccatacccccaacagcaggaattttggctagatggggctcattcaagaagaatggagcgatcaaagtgctacggggtttcaagaaagaaatctcaaacatgttgaacataatgaatagaaggaaaagatctgtgaccatgctcctcatgctgctgcccacagccttggcgttccatttgactacacgagggggagagccacacatgatagttagcaagcaggaaagagaaaagtcactcttgtttaagacctctgtaggtgtcaacatgtgcacccttatagcgatggatttgggagagttatgtgaggacacaatgacttacaaatgccctcgaattactgaggcggaaccagatgacgttgattgttggtgcaatgctacagacacatgggtgacctatggaacatgttcccaaactggcgagcaccgacgggacaaacgttccgtcgcactggccccacacgtgggacttggtctagaaacaagaaccgaaacgtggatgtcctctgaaggcgcttggaaacaaatacaaagagtggagacttgggctttgcgacacccaggattcacggtgatagccctttttcttgcacatgccataggaacatccatcactcagaaagggattattttcattttgttaatgctagtaacaccatccatggccatgcgatgcgtgggaataggcagcagggacttcgtggaaggactatcaggagcaacttgggtagacgtggtactggaacatggaagttgcgtcaccaccatggcaaaagacaaaccaacattggacattgaactcctgaaaacggaggtcacgaaccctgccgtcctgcgcaaactgtgcattgaagctaaaatatcaaacaccaccaccgattcaagatgtccaacacaaggagaagctacactggtggaagaacaagacgcgaactttgtgtgtcgacgaacgttcgtggacagaggctggggtaatggctgcggactatttggaaaaggaagcctactgacgtgtgctaagttcaagtgtgtgacaaaactagaaggaaagatagttcaatatgaaaacttaaaatattcagtgatagtcactgtccacactggggaccagcaccaggtgggaaacgagactacagaacatggaacaattgcaaccataacacctcaagctcctacgtcggaaatacagctgaccgactacggagccctcacattggactgctcacctagaactgggctggactttaatgagatggtgctattgacaatgaaagaaaaatcatggcttgttcacaaacaatggtttctagacttaccactgccttggacttcgggggcttcaacatcccaagagacttggaacagacaagatttgctggtcacattcaagacagctcatgcaaagaagcaggaagtagtcgtactgggatcacaggaaggagcaatgcacactgcgttgactggggcgacagaaatccaaacgtctggaacgacaacaatttttgcaggacacctgaaatgtagactaaaaatggacaaactgactctaaaagggatgtcatatgtgatgtgcacaggctcatttaagctagagaaggaagtggctgagacccagcatggaactgttttagtgcaggttaaatacgaaggaacagatgcaccatgcaagatccccttttcgacccaagatgagaaaggagtgacccagaatagattga
taacagccaatcctatagttactgacaaagaaaaaccagtcaacattgagacagaaccaccttttggtgagagctacatcgtggtaggggcaggtgaaaaagctttgaaacaatgctggttcaagaaaggaagcagcatagggaaaatgttcgaagcaaccgcccgaggagcacgaaggatggctatcctgggagacaccgcatgggacttcggttctataggaggagtgttcacgtctgtgggaaaattagtgcatcaggtttttggaaccgcatatggggttctgttcagcggtgtttcttggaccatgaaaataggaatagggattctgctgacatggttgggattaaattcaaggagcacgtcactttcgatgacgtgcattgcagttggcatggtcacactgtacctaggagtcatggttcaagcggactcgggatgtgtaatcaactggaagggcagagaactcaaatgtggaagtggcatttttgtcactaatgaagtccacacttggacagagcaatacaaatttcaagctgactccccaaaaagactatcagcagccatcggaaaggcatgggaggagggtgtgtgtggaattcgatcagccactcgtctcgagaacatcatgtggaagcaaatatcaaatgaactgaaccacatcttacttgaaaatgacatgaaattcacagtggttgtaggagatgttgttgggatcttggcccaagggaaaaaaatgattagaccacaacccatggaacacaaatactcatggaaaagctggggaaaagccaaaatcataggagcagacatacagaacaccaccttcatcattgacggcccagatactccagaatgtcctgatgaccaaagagcatggaacatttgggaagttgaggactatgggttcggaattttcacgacaaacatatggttgaaattgcgtgactcctacacccaaatgtgtgaccaccggctaatgtcagctgccatcaaggacagcaaggcagtccatgctgatatggggtactggatagaaagtgaaaagaacgagacctggaagctggcaagagcctctttcatagaagttaaaacatgtgtctggccaaaatcccacactctatggagcaatggagttctggaaagtgaaatgataattccaaagatctatggaggaccaatatctcagcacaactacagaccaggatatttcacacaaacggcagggccatggcacctaggcaagttggaactggattttgatttgtgtgagggtaccacagttgttgtggatgaacattgtggaaatcgaggtccatctcttagaaccacaacagtcacaggaaagataattcatgaatggtgttgcagatcttgtacgctaccacccttacgtttcaaaggagaagatggatgttggtacggtatggaaatcagaccagtcaaggaaaaggaagagaatctagtcaaatcaatggtctctgcagggtcaggggaagtggacagcttttcactaggactgctatgcatatcaataatgatcgaagaggtgatgagatccagatggagcagaaaaatgctgatgactggaacactggctgtgttcctccttctcataatgggacaattgacatggaatgatctgatcaggttatgcatcatggttggagccaatgcttcagacaggatggggatgggaacaacgtacctagctctgatggccacttttaaaatgagaccaatgtttgctgtcgggctgttgttccgcagactaacatctagagaagttcttcttcttacaattggattgagtctagtggcatctgtggagttaccaaattccctggaggagctgggggatggacttgcaatgggcattatgattttaaaattattgactgactttcagtcacatcagctgtgggctaccttgctgtccttgacatttgtcaaaacaacgttttccttgcactatgcatggaagacaatggctatggta
ctgtcaattgtatctctcttccccttatgcctgtccacgacctcccaaaaaacaacatggcttccggtgctattgggatctcttggatgcaaaccactaaccatgtttctcatagcagaaaacaaaatctggggaaggaaaagttggcccctcaatgaaggaatcatggctgttggaatagtcagcatcctactaagttcactcctcaaaaatgatgtgccgctagctgggccactaatagctggaggcatgctaatagcatgttacgttatatctggaagctcagccgacttatcactagagaaagcggctgaggtctcctgggaagaagaagcagaacactctggtgcctcacacaatatattagtggaggtccaagatgatggaaccatgaagataaaagatgaagagagagatgacacgctaaccattctccttaaagcaaccctgctagcagtttcaggggtgtacccattatcaataccagcaaccctttttgtgtggtacttttggcagaaaaagaaacaaagatctggagtgttatgggacacacctagccctccagaagtggaaagagcagtccttgatgatggtatctatagaattatgcagagaggactgttgggcaggtcccaagtaggagtgggagttttccaagacggcgtgttccacacaatgtggcacgtcaccaggggagctgtccttatgtaccaagggaagaggctggaaccaagctgggccagtgtcaaaaaagacttgatctcatatggaggaggttggaggtttcaaggatcctggaacacgggagaagaagtgcaggtgattgctgttgaaccaggaaaaaaccccaaaaatgtacagacagcgccgggtaccttcaagacccctgaaggtgaagttggagctattgccctagattttaaacccggcacatctggatctcccatcgtgaacagagaaggaaaaatagtaggtctttatggaaatggagtagtgacaacaagtggaacctacgtcagtgccatagcccaagccaaagcatcacaagaagggcccctaccagagattgaggacgaggtgtttaggaaaagaaacttaacaataatggacctacatccaggatcggggaaaacaagaagatatcttccagccatagtccgtgaggccataagaaggaacgtgcgcacactaattttggctcccacaagggttgtcgcttccgaaatggcagaggcgctcaagggaatgccaataaggtaccaaacaacagcagtgaagagtgaacacacaggaaaagagatagttgacctcatgtgtcacgccactttcaccatgcgtctcctgtctcccgtgagagttcccaattacaacatgattatcatggatgaagcacattttaccgatccagccagcatagcgcgcagagggtacatctcaacccgagtgggcatgggtgaagcagctgcgatcttcatgacagccactcccccaggatcggtggaggcctttccacagagcaatgcagttatccaagatgaggaaagagacattcctgagagatcatggaactcaggctatgagtggatcactgacttcccaggtaaaacagtctggtttgttccaagcatcaaatcaggaaatgacattgccaactgcttaagaaagaatgggaaacgggtgattcaattgagcaggaaaacctttgatacagagtaccaaaaaacaaaaaacaacgactgggactatgtcgtcacaacagatatctccgaaatgggagcaaacttccgagccgacagggtgatagacccaagacggtgtctgaaaccggtaatactaaaagatggtccagagcgcgtcattctagccggaccgatgccagtgactgtggccagtgctgcccagaggagaggaagaattggaaggaaccaaaacaaagaaggtgatcagtacgtttacatgggacagcctttaaataatgatgaggatcacgctcattggac
agaagcaaaaatgctccttgacaatataaacacaccagaagggatcatcccagccctctttgagccagagagagaaaagagtgcagcaatagacggggagtacagactgcggggagaagcaagaaaaacgtttgtggagctcatgagaagaggagatctacctgtctggctatcctacaaagttgcctcagaaggcttccagtactctgacagaagatggtgctttgacggggaaaggaacaaccaggtgttggaggagaacatggacgtggagatgtggacaaaagaaggagaacgaaagaaactacgaccccgctggctggatgccagaacatactcagacccactggccctgcgcgagtttaaagagtttgcagcaggaagaagaagtgtctcaggtgatctaatattagaaatagggaaacttccacaacacttgacgcaaagggcccagaatgccttggacaacctggttatgttgcacaactccgaacaaggaggaagagcctacagacatgcaatggaagaacttccagacaccatagaaacgttgatgctcctagctttgatagctgtgttaactggtggagtgacgctgttcttcctatcaggaaagggcctagggaaaacatctattggcctactctgcgtgatggcttcaagcgtactgctatggatggccagcgtggagcctcattggatagcggcctccatcatactagagtttttcctgatggtgctgcttattccagagccagacagacagcgcactccacaggacaaccagttagcatatgtggtgataggtttgttattcatgatactcacagtggcagccaatgagatgggattattggaaaccacaaagaaagacttagggattggccatgtagccgccgaaaaccaccaccatgctacaatgctggacgtagacctacgtccagcttcagcctggaccctctatgcagtagccacaacagttatcacccccatgatgagacacacaattgaaaatacaacggcaaatatttccctgacagccattgcaaaccaggcagctatattgatgggacttgataaaggatggccaatatcgaagatggacataggagttccacttctcgccttggggtgctattcccaggtgaatccactgacgctgacagcggcggtattgatgctagtggctcattacgccataattggacctggactgcaagcaaaagcgactagagaagctcaaaaaaggacagcggccggaataatgaaaaatccaaccgttgatggaattgttgcaatagatttggaccctgtggtttatgatgcaaaatttgagaaacaactaggccaaataatgttgttgatactatgcacatcacagatcctcttgatgcggactacatgggccttgtgtgaatccatcacactggccactggacctctgaccacgctttgggagggatctccaggaaaattttggaacaccacgatagcggtttccatggcaaacattttcaggggaagttatctagcaggagcaggcctggccttctcattaatgaaatctctaggaggaggtaggagaggtacgggagccaaggggaaacactgggagagaaatggaaaagacagactgaaccaactgagcaagtcagaattcaacacttacaaaaggagtgggattatggaagtggacagatccgaagccaaagagggactgaaaagaggagaaacaaccaaacatgcagtgtcgagaggaaccgccaaattgaggtggttcgtggagaggaaccttgtgaaaccagaagggaaagtcatagacctcggttgtggaagaggtggctggtcatactattgcgctgggctgaagaaagtcacagaagtgaagggatacacaaaaggaggacctggacatgaggaaccaatcccaatggcgacctatggatggaacctagtaaagctatactccgggaaagacgtattctttacaccacctgaga
agtgtgacacccttttgtgtgatattggtgagtcctctccaaacccaactatagaagaaggaagaacgttacgcgtcctaaagatggtggaaccatggctcagagggaaccaattttgcataaaaattctaaatccctacatgccaagtgtggtggaaactctggagcaaatgcaaagaaaacatggaggaatgctagtgcggaatccactttcaagaaattctactcatgaaatgtattgggtttcatgtggaacaggaaacattgtgtcagcagtaaacatgacatctagaatgttgctaaatcgattcacaatggctcacaggaaaccaacatatgaaagagacgtggacttaggcgctggaacaagacatgtggcagtggaaccagaggtagccaacctagatatcattggccagaggatagagaacataaaacatgaacataagtcaacatggcattatgatgaggacaatccatataaaacatgggcctatcatggatcatatgaggtcaagccatcaggatcagcctcatccatggtcaatggcgtggtgaaactgctcaccaaaccatgggatgccatccccatggtcacacaaatagccatgactgacaccacaccctttggacaacagagggtgtttaaagagaaagttgacacgcgcacaccaaaagcaaaacgaggcacagcacaaatcatggaggtgacagccaggtggttatggggttttctctctagaaacaaaaaaccaagaatttgtacaagagaggagttcacaagaaaagttaggtcaaacgcagccattggagcagtgttcgttgatgaaaatcaatggaactcagcaaaagaagcagtggaagatgagcggttctgggaccttgtgcacagagagagggagcttcacaaacagggaaaatgtgccacgtgtgtttacaacatgatggggaagagagagaaaaaactaggagagttcggaaaggcaaaaggaagtcgtgcaatatggtacatgtggttgggagcacgctttctagagttcgaagctcttggtttcatgaacgaagatcactggttcagtagagagaattcactcagtggagtggaaggagaaggactccacaaactcggatatatactcagagacatatcaaagattccagggggaaatatgtatgcagatgacacagccggatgggatacaaggataacagaggatgatcttcagaatgaggccaaaattactgacatcatggagcccgaacatgccctactggctacgtcaatcttcaagctgacctaccaaaataaggtggtaagggtacagagaccagcgaaaaatggaaccgtgatggatgtcatatccagacgtgaccagagaggaagtggccaggtcggaacttatggcttaaacactttcactaacatggaagcccagctaataagacaaatggagtctgagggaatcttttcacccagcgaattggagaccccaaatttagccgagagagttctcgactggctggaaaaatatggcgtcgaaaggctgaaaagaatggcaatcagcggagatgactgcgtggtgaaaccaattgatgacaggttcgcaacagccttaacagctctgaatgatatgggaaaagtaagaaaagatataccacaatgggaaccctcaaaaggatggaatgattggcaacaggtgcctttttgttcacaccatttccaccagctgattatgaaggatgggagggaaatagtggtgccatgccgcaaccaagatgaacttgtgggtagggctagagtatcacaaggtgctggatggagcctgagagaaactgcatgcctaggcaagtcatatgcacaaatgtggcagctgatgtacttccacaggagagacctgagactagctgctaatgctatctgttcagccgttccagttgattgggtcccaaccagccgcaccacttggtcgatccatgcccatcaccaatggatgacaaca
gaagacatgttgtcagtgtggaatagggt…'
               RetrieveURL: 'http://www.ebi.ac.uk/Tools/dbfetch/dbfetch?db=EMBL&id=M87512&style=raw'

seqdata = struct with fields:
            Identification: [1×1 struct]
                 Accession: 'M87512'
           SequenceVersion: 'M87512.1'
               DateCreated: '03-JUN-1992 (Rel. 32, Created)'
               DateUpdated: '04-MAR-2000 (Rel. 63, Last updated, Version 3)'
               Description: 'Dengue virus type 1 complete genome.                                       '
                   Keyword: 'complete genome.                                                           '
           OrganismSpecies: 'Dengue virus 1                                                             '
    OrganismClassification: 'Viruses; Riboviria; Flaviviridae; Flavivirus.                              '
                 Organelle: ''
                 Reference: {[1×1 struct]}
    DatabaseCrossReference: [56×75 char]
                  Comments: [2×75 char]
                  Assembly: ''
                   Feature: [7×75 char]
                 BaseCount: [1×1 struct]
                  Sequence: 'gtggaccgcaaagaacagtttcgaatcggaagcttgcttaacgtagttctaacagttttttattagagagcagatctctgatgaacaaccaacgaaaaaagacggctcgaccgtctttcaatatgctgaaacgcgcgagaaaccgcgtgtcaactggttcacagttggcgaagagattctcaaaaggattgctttcaggccaaggacccatgaaattggtgatggctttcatagcattcctaagatttctagccatacccccaacagcaggaattttggctagatggggctcattcaagaagaatggagcgatcaaagtgctacggggtttcaagaaagaaatctcaaacatgttgaacataatgaatagaaggaaaagatctgtgaccatgctcctcatgctgctgcccacagccttggcgttccatttgactacacgagggggagagccacacatgatagttagcaagcaggaaagagaaaagtcactcttgtttaagacctctgtaggtgtcaacatgtgcacccttatagcgatggatttgggagagttatgtgaggacacaatgacttacaaatgccctcgaattactgaggcggaaccagatgacgttgattgttggtgcaatgctacagacacatgggtgacctatggaacatgttcccaaactggcgagcaccgacgggacaaacgttccgtcgcactggccccacacgtgggacttggtctagaaacaagaaccgaaacgtggatgtcctctgaaggcgcttggaaacaaatacaaagagtggagacttgggctttgcgacacccaggattcacggtgatagccctttttcttgcacatgccataggaacatccatcactcagaaagggattattttcattttgttaatgctagtaacaccatccatggccatgcgatgcgtgggaataggcagcagggacttcgtggaaggactatcaggagcaacttgggtagacgtggtactggaacatggaagttgcgtcaccaccatggcaaaagacaaaccaacattggacattgaactcctgaaaacggaggtcacgaaccctgccgtcctgcgcaaactgtgcattgaagctaaaatatcaaacaccaccaccgattcaagatgtccaacacaaggagaagctacactggtggaagaacaagacgcgaactttgtgtgtcgacgaacgttcgtggacagaggctggggtaatggctgcggactatttggaaaaggaagcctactgacgtgtgctaagttcaagtgtgtgacaaaactagaaggaaagatagttcaatatgaaaacttaaaatattcagtgatagtcactgtccacactggggaccagcaccaggtgggaaacgagactacagaacatggaacaattgcaaccataacacctcaagctcctacgtcggaaatacagctgaccgactacggagccctcacattggactgctcacctagaactgggctggactttaatgagatggtgctattgacaatgaaagaaaaatcatggcttgttcacaaacaatggtttctagacttaccactgccttggacttcgggggcttcaacatcccaagagacttggaacagacaagatttgctggtcacattcaagacagctcatgcaaagaagcaggaagtagtcgtactgggatcacaggaaggagcaatgcacactgcgttgactggggcgacagaaatccaaacgtctggaacgacaacaatttttgcaggacacctgaaatgtagactaaaaatggacaaactgactctaaaagggatgtcatatgtgatgtgcacaggctcatttaagctagagaaggaagtggctgagacccagcatggaactgttttagtgcaggttaaatacgaaggaacagatgcaccatgcaagatccccttttcgacccaagatgagaaaggagtgacccagaatagattga
taacagccaatcctatagttactgacaaagaaaaaccagtcaacattgagacagaaccaccttttggtgagagctacatcgtggtaggggcaggtgaaaaagctttgaaacaatgctggttcaagaaaggaagcagcatagggaaaatgttcgaagcaaccgcccgaggagcacgaaggatggctatcctgggagacaccgcatgggacttcggttctataggaggagtgttcacgtctgtgggaaaattagtgcatcaggtttttggaaccgcatatggggttctgttcagcggtgtttcttggaccatgaaaataggaatagggattctgctgacatggttgggattaaattcaaggagcacgtcactttcgatgacgtgcattgcagttggcatggtcacactgtacctaggagtcatggttcaagcggactcgggatgtgtaatcaactggaagggcagagaactcaaatgtggaagtggcatttttgtcactaatgaagtccacacttggacagagcaatacaaatttcaagctgactccccaaaaagactatcagcagccatcggaaaggcatgggaggagggtgtgtgtggaattcgatcagccactcgtctcgagaacatcatgtggaagcaaatatcaaatgaactgaaccacatcttacttgaaaatgacatgaaattcacagtggttgtaggagatgttgttgggatcttggcccaagggaaaaaaatgattagaccacaacccatggaacacaaatactcatggaaaagctggggaaaagccaaaatcataggagcagacatacagaacaccaccttcatcattgacggcccagatactccagaatgtcctgatgaccaaagagcatggaacatttgggaagttgaggactatgggttcggaattttcacgacaaacatatggttgaaattgcgtgactcctacacccaaatgtgtgaccaccggctaatgtcagctgccatcaaggacagcaaggcagtccatgctgatatggggtactggatagaaagtgaaaagaacgagacctggaagctggcaagagcctctttcatagaagttaaaacatgtgtctggccaaaatcccacactctatggagcaatggagttctggaaagtgaaatgataattccaaagatctatggaggaccaatatctcagcacaactacagaccaggatatttcacacaaacggcagggccatggcacctaggcaagttggaactggattttgatttgtgtgagggtaccacagttgttgtggatgaacattgtggaaatcgaggtccatctcttagaaccacaacagtcacaggaaagataattcatgaatggtgttgcagatcttgtacgctaccacccttacgtttcaaaggagaagatggatgttggtacggtatggaaatcagaccagtcaaggaaaaggaagagaatctagtcaaatcaatggtctctgcagggtcaggggaagtggacagcttttcactaggactgctatgcatatcaataatgatcgaagaggtgatgagatccagatggagcagaaaaatgctgatgactggaacactggctgtgttcctccttctcataatgggacaattgacatggaatgatctgatcaggttatgcatcatggttggagccaatgcttcagacaggatggggatgggaacaacgtacctagctctgatggccacttttaaaatgagaccaatgtttgctgtcgggctgttgttccgcagactaacatctagagaagttcttcttcttacaattggattgagtctagtggcatctgtggagttaccaaattccctggaggagctgggggatggacttgcaatgggcattatgattttaaaattattgactgactttcagtcacatcagctgtgggctaccttgctgtccttgacatttgtcaaaacaacgttttccttgcactatgcatggaagacaatggctatggta
ctgtcaattgtatctctcttccccttatgcctgtccacgacctcccaaaaaacaacatggcttccggtgctattgggatctcttggatgcaaaccactaaccatgtttctcatagcagaaaacaaaatctggggaaggaaaagttggcccctcaatgaaggaatcatggctgttggaatagtcagcatcctactaagttcactcctcaaaaatgatgtgccgctagctgggccactaatagctggaggcatgctaatagcatgttacgttatatctggaagctcagccgacttatcactagagaaagcggctgaggtctcctgggaagaagaagcagaacactctggtgcctcacacaatatattagtggaggtccaagatgatggaaccatgaagataaaagatgaagagagagatgacacgctaaccattctccttaaagcaaccctgctagcagtttcaggggtgtacccattatcaataccagcaaccctttttgtgtggtacttttggcagaaaaagaaacaaagatctggagtgttatgggacacacctagccctccagaagtggaaagagcagtccttgatgatggtatctatagaattatgcagagaggactgttgggcaggtcccaagtaggagtgggagttttccaagacggcgtgttccacacaatgtggcacgtcaccaggggagctgtccttatgtaccaagggaagaggctggaaccaagctgggccagtgtcaaaaaagacttgatctcatatggaggaggttggaggtttcaaggatcctggaacacgggagaagaagtgcaggtgattgctgttgaaccaggaaaaaaccccaaaaatgtacagacagcgccgggtaccttcaagacccctgaaggtgaagttggagctattgccctagattttaaacccggcacatctggatctcccatcgtgaacagagaaggaaaaatagtaggtctttatggaaatggagtagtgacaacaagtggaacctacgtcagtgccatagcccaagccaaagcatcacaagaagggcccctaccagagattgaggacgaggtgtttaggaaaagaaacttaacaataatggacctacatccaggatcggggaaaacaagaagatatcttccagccatagtccgtgaggccataagaaggaacgtgcgcacactaattttggctcccacaagggttgtcgcttccgaaatggcagaggcgctcaagggaatgccaataaggtaccaaacaacagcagtgaagagtgaacacacaggaaaagagatagttgacctcatgtgtcacgccactttcaccatgcgtctcctgtctcccgtgagagttcccaattacaacatgattatcatggatgaagcacattttaccgatccagccagcatagcgcgcagagggtacatctcaacccgagtgggcatgggtgaagcagctgcgatcttcatgacagccactcccccaggatcggtggaggcctttccacagagcaatgcagttatccaagatgaggaaagagacattcctgagagatcatggaactcaggctatgagtggatcactgacttcccaggtaaaacagtctggtttgttccaagcatcaaatcaggaaatgacattgccaactgcttaagaaagaatgggaaacgggtgattcaattgagcaggaaaacctttgatacagagtaccaaaaaacaaaaaacaacgactgggactatgtcgtcacaacagatatctccgaaatgggagcaaacttccgagccgacagggtgatagacccaagacggtgtctgaaaccggtaatactaaaagatggtccagagcgcgtcattctagccggaccgatgccagtgactgtggccagtgctgcccagaggagaggaagaattggaaggaaccaaaacaaagaaggtgatcagtacgtttacatgggacagcctttaaataatgatgaggatcacgctcattggac
agaagcaaaaatgctccttgacaatataaacacaccagaagggatcatcccagccctctttgagccagagagagaaaagagtgcagcaatagacggggagtacagactgcggggagaagcaagaaaaacgtttgtggagctcatgagaagaggagatctacctgtctggctatcctacaaagttgcctcagaaggcttccagtactctgacagaagatggtgctttgacggggaaaggaacaaccaggtgttggaggagaacatggacgtggagatgtggacaaaagaaggagaacgaaagaaactacgaccccgctggctggatgccagaacatactcagacccactggccctgcgcgagtttaaagagtttgcagcaggaagaagaagtgtctcaggtgatctaatattagaaatagggaaacttccacaacacttgacgcaaagggcccagaatgccttggacaacctggttatgttgcacaactccgaacaaggaggaagagcctacagacatgcaatggaagaacttccagacaccatagaaacgttgatgctcctagctttgatagctgtgttaactggtggagtgacgctgttcttcctatcaggaaagggcctagggaaaacatctattggcctactctgcgtgatggcttcaagcgtactgctatggatggccagcgtggagcctcattggatagcggcctccatcatactagagtttttcctgatggtgctgcttattccagagccagacagacagcgcactccacaggacaaccagttagcatatgtggtgataggtttgttattcatgatactcacagtggcagccaatgagatgggattattggaaaccacaaagaaagacttagggattggccatgtagccgccgaaaaccaccaccatgctacaatgctggacgtagacctacgtccagcttcagcctggaccctctatgcagtagccacaacagttatcacccccatgatgagacacacaattgaaaatacaacggcaaatatttccctgacagccattgcaaaccaggcagctatattgatgggacttgataaaggatggccaatatcgaagatggacataggagttccacttctcgccttggggtgctattcccaggtgaatccactgacgctgacagcggcggtattgatgctagtggctcattacgccataattggacctggactgcaagcaaaagcgactagagaagctcaaaaaaggacagcggccggaataatgaaaaatccaaccgttgatggaattgttgcaatagatttggaccctgtggtttatgatgcaaaatttgagaaacaactaggccaaataatgttgttgatactatgcacatcacagatcctcttgatgcggactacatgggccttgtgtgaatccatcacactggccactggacctctgaccacgctttgggagggatctccaggaaaattttggaacaccacgatagcggtttccatggcaaacattttcaggggaagttatctagcaggagcaggcctggccttctcattaatgaaatctctaggaggaggtaggagaggtacgggagccaaggggaaacactgggagagaaatggaaaagacagactgaaccaactgagcaagtcagaattcaacacttacaaaaggagtgggattatggaagtggacagatccgaagccaaagagggactgaaaagaggagaaacaaccaaacatgcagtgtcgagaggaaccgccaaattgaggtggttcgtggagaggaaccttgtgaaaccagaagggaaagtcatagacctcggttgtggaagaggtggctggtcatactattgcgctgggctgaagaaagtcacagaagtgaagggatacacaaaaggaggacctggacatgaggaaccaatcccaatggcgacctatggatggaacctagtaaagctatactccgggaaagacgtattctttacaccacctgaga
agtgtgacacccttttgtgtgatattggtgagtcctctccaaacccaactatagaagaaggaagaacgttacgcgtcctaaagatggtggaaccatggctcagagggaaccaattttgcataaaaattctaaatccctacatgccaagtgtggtggaaactctggagcaaatgcaaagaaaacatggaggaatgctagtgcggaatccactttcaagaaattctactcatgaaatgtattgggtttcatgtggaacaggaaacattgtgtcagcagtaaacatgacatctagaatgttgctaaatcgattcacaatggctcacaggaaaccaacatatgaaagagacgtggacttaggcgctggaacaagacatgtggcagtggaaccagaggtagccaacctagatatcattggccagaggatagagaacataaaacatgaacataagtcaacatggcattatgatgaggacaatccatataaaacatgggcctatcatggatcatatgaggtcaagccatcaggatcagcctcatccatggtcaatggcgtggtgaaactgctcaccaaaccatgggatgccatccccatggtcacacaaatagccatgactgacaccacaccctttggacaacagagggtgtttaaagagaaagttgacacgcgcacaccaaaagcaaaacgaggcacagcacaaatcatggaggtgacagccaggtggttatggggttttctctctagaaacaaaaaaccaagaatttgtacaagagaggagttcacaagaaaagttaggtcaaacgcagccattggagcagtgttcgttgatgaaaatcaatggaactcagcaaaagaagcagtggaagatgagcggttctgggaccttgtgcacagagagagggagcttcacaaacagggaaaatgtgccacgtgtgtttacaacatgatggggaagagagagaaaaaactaggagagttcggaaaggcaaaaggaagtcgtgcaatatggtacatgtggttgggagcacgctttctagagttcgaagctcttggtttcatgaacgaagatcactggttcagtagagagaattcactcagtggagtggaaggagaaggactccacaaactcggatatatactcagagacatatcaaagattccagggggaaatatgtatgcagatgacacagccggatgggatacaaggataacagaggatgatcttcagaatgaggccaaaattactgacatcatggagcccgaacatgccctactggctacgtcaatcttcaagctgacctaccaaaataaggtggtaagggtacagagaccagcgaaaaatggaaccgtgatggatgtcatatccagacgtgaccagagaggaagtggccaggtcggaacttatggcttaaacactttcactaacatggaagcccagctaataagacaaatggagtctgagggaatcttttcacccagcgaattggagaccccaaatttagccgagagagttctcgactggctggaaaaatatggcgtcgaaaggctgaaaagaatggcaatcagcggagatgactgcgtggtgaaaccaattgatgacaggttcgcaacagccttaacagctctgaatgatatgggaaaagtaagaaaagatataccacaatgggaaccctcaaaaggatggaatgattggcaacaggtgcctttttgttcacaccatttccaccagctgattatgaaggatgggagggaaatagtggtgccatgccgcaaccaagatgaacttgtgggtagggctagagtatcacaaggtgctggatggagcctgagagaaactgcatgcctaggcaagtcatatgcacaaatgtggcagctgatgtacttccacaggagagacctgagactagctgctaatgctatctgttcagccgttccagttgattgggtcccaaccagccgcaccacttggtcgatccatgcccatcaccaatggatgacaaca
gaagacatgttgtcagtgtggaatagggt…'
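
The EMBLData and seqdata structures above describe the same record, but the text fields returned by emblread appear padded with trailing blanks (for example Description), while those returned by getembl are not. The following is a minimal sketch of comparing such fields after trimming. It assumes only the two Description values shown in the output; the variable names are hypothetical, and the padded width of 75 characters is an assumption taken from the [56×75 char] fields.

```matlab
% Description values copied from the EMBLData and seqdata output above.
% Variable names are hypothetical, chosen for this illustration only.
fetchedDescription = 'Dengue virus type 1 complete genome.';
% Assumption: the file-read value is padded with blanks to 75 characters.
fileDescription = ['Dengue virus type 1 complete genome.' blanks(39)];

% A direct comparison fails because of the trailing blanks.
rawMatch = strcmp(fileDescription, fetchedDescription);

% strtrim removes leading and trailing whitespace before comparing.
trimmedMatch = strcmp(strtrim(fileDescription), fetchedDescription);

fprintf('Raw match: %d, trimmed match: %d\n', rawMatch, trimmedMatch);
```

The same strtrim step could be applied to the other padded fields, such as Keyword or OrganismSpecies, before comparing the two structures.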

Task 2 

Using the Dengue virus sequence from the previous task, write a MATLAB script that performs the following tasks:

1. Calculate the nucleotide count of the Dengue virus genome sequence.

2. Identify the length of the nucleotide sequence.

3. Visualize the nucleotide count.

4. Visualize the ORFs of the nucleotide sequence. (https://www.mathworks.com/help/bioinfo/examples/calculating-and-visualizing-sequence-statistics.html)

5. Identify the three codons with the highest counts in the nucleotide sequence.

6. Look up the amino acids for the codons in (5).

7. Convert the nucleotide sequence to an amino acid sequence.

%Retrieve the Dengue virus type 1 GenBank record and extract its sequence.
dengue_virus_gbk = getgenbank('AY145123');
dengue_virus = dengue_virus_gbk.Sequence;
%Identify the length of the nucleotide sequence.
dengue_virus_length = length(dengue_virus)
%Calculate the nucleotide count of the Dengue virus genome sequence.
basecount(dengue_virus)


%Visualize the nucleotide count.
figure 
basecount(dengue_virus,'chart','pie');
title('Distribution of Nucleotide Bases for Dengue Virus');

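%Count dinucleotides (dimers) and plot them as a bar chart.
figure
dimers = dimercount(dengue_virus,'chart','bar')
title('Dengue Virus Genome Dimer Histogram');
%Display the standard genetic code (codon to amino acid map).
Map = geneticcode

%Count the codons in each of the three forward reading frames.
for frame = 1:3
    figure
    subplot(2,1,1)
    codoncount(dengue_virus,'frame',frame,'figure',true,'geneticcode','none')
end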
%Visualize ORF of nucleotide sequence.
seqshoworfs(dengue_virus);
orfs = seqshoworfs(dengue_virus,'GeneticCode','Standard','AlternativeStartCodons',true)

%Extract the coding sequence (CDS) feature from the GenBank record.
gbkStruct = getgenbank('AY145123')
CDS = featureparse(gbkStruct,'feature','cds')
CDS
%Parse all features, then select the first CDS entry.
features = featureparse(gbkStruct);
coding_sequences = features.CDS
DVCDS = coding_sequences(1)

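%Read the same genome from a FASTA file saved locally (header AY145123.1).
fseq = fastaread('sequence_ncbi.fasta')
%Draw a map of the annotated features in the GenBank record.
[h,l] = featureview(gbkStruct,{'CDS','tRNA','rRNA','D_loop'},...
                                      [2 1 2 2 2],'Fontsize',9);
legend(h,l,'interpreter','none');
title('Dengue Virus 1, complete genome')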

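%Translate the genome (nt2aa uses reading frame 1 by default) and plot the
%amino acid counts of the translation. The annotated CDS spans 95..10273,
%so frame 1 of the whole genome is not the polyprotein reading frame.
DV = nt2aa(fseq,'GeneticCode','Standard');
figure
subplot(2,1,1)
DVaaCount = aacount(DV,'chart','bar');
title('Histogram of Amino Acid Count for the Dengue Viral Protein');
disp(seqdisp(DV))

%Count the codons; the three largest counts identify the most frequent codons.
codons = codoncount(fseq)
%Visualize the codon frequencies as a heat map.
figure
count = codoncount(fseq,'figure',true);
title('Dengue Virus Genome Codon Frequency')

%Convert the nucleotide sequence to an amino acid sequence.
DVSeq = nt2aa(fseq,'geneticcode','Standard')
%Retrieve the annotated polyprotein (protein_id AAN06983) for comparison.
DVprotein = getgenpept('AAN06983','sequenceonly',true)
aacount(DVSeq, 'chart','bar')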


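%Retrieve the full GenPept record of the polyprotein.
denguepro = getgenpept('AAN06983')
%Atomic composition of the polyprotein (counts of C, H, N, O, S).
dengueproAC = atomiccomp(denguepro)

%Number of carbon atoms.
dengueproAC.C
%Molecular weight of the polyprotein.
dengueproMW = molweight(denguepro)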
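
Tasks (5) and (6) ask for the three most frequent codons and their amino acids. The following is a minimal sketch of one way to do this, assuming the Bioinformatics Toolbox functions codoncount and geneticcode used above. The variable demo_seq is a made-up toy sequence for illustration, not Dengue data, and all other variable names are hypothetical.

```matlab
% Sketch: rank codons by count and look up the amino acid of the top three.
% demo_seq is a made-up toy sequence (assumption), not taken from the genome.
demo_seq = 'ATGGGAGGAGGAGGAAAAAAAAAAGAAGAATAA';

counts = codoncount(demo_seq);                  % struct with one field per codon
codon_names  = fieldnames(counts);              % cell array of codon names, e.g. 'AAA'
codon_values = cell2mat(struct2cell(counts));   % counts in the same order

% Sort the counts in descending order and keep the first three.
[sorted_values, order] = sort(codon_values, 'descend');
top_codons = codon_names(order(1:3));
top_counts = sorted_values(1:3);

% The standard genetic code maps each codon name to a one-letter amino acid.
code_map = geneticcode;
top_amino_acids = cell(3,1);
for k = 1:3
    top_amino_acids{k} = code_map.(top_codons{k});
    fprintf('%s  %d  %s\n', top_codons{k}, top_counts(k), top_amino_acids{k});
end
```

To apply the same steps to the Dengue genome, demo_seq could be replaced by the coding region, for example dengue_virus(95:10273), using the CDS indices reported by featureparse. Counting codons over the whole genome in frame 1 would also include the untranslated regions.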

Output

dengue_virus_length = 10705

ans = struct with fields:
    A: 3419
    C: 2234
    G: 2759
    T: 2293
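
As a quick consistency check, the four base counts should add up to the sequence length reported above. The sketch below re-enters the counts by hand (the variable names are hypothetical) and also derives the GC fraction from them.

```matlab
% Base counts copied from the basecount output above.
counts = struct('A', 3419, 'C', 2234, 'G', 2759, 'T', 2293);

% The counts should sum to the genome length (dengue_virus_length).
total_bases = counts.A + counts.C + counts.G + counts.T;

% GC fraction = (G + C) / total number of bases.
gc_fraction = (counts.G + counts.C) / total_bases;

fprintf('Total bases: %d\n', total_bases);
fprintf('GC content:  %.1f%%\n', 100 * gc_fraction);
```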

dimers = struct with fields:
    AA: 1106
    AC: 717
    AG: 888
    AT: 708
    CA: 899
    CC: 521
    CG: 260
    CT: 554
    GA: 975
    GC: 498
    GG: 782
    GT: 504
    TA: 438
    TC: 498
    TG: 829
    TT: 527

Map = struct with fields:
      Name: 'Standard'
       AAA: 'K'
       AAC: 'N'
       AAG: 'K'
       AAT: 'N'
       ACA: 'T'
       ACC: 'T'
       ACG: 'T'
       ACT: 'T'
       AGA: 'R'
       AGC: 'S'
       AGG: 'R'
       AGT: 'S'
       ATA: 'I'
       ATC: 'I'
       ATG: 'M'
       ATT: 'I'
       CAA: 'Q'
       CAC: 'H'
       CAG: 'Q'
       CAT: 'H'
       CCA: 'P'
       CCC: 'P'
       CCG: 'P'
       CCT: 'P'
       CGA: 'R'
       CGC: 'R'
       CGG: 'R'
       CGT: 'R'
       CTA: 'L'
       CTC: 'L'
       CTG: 'L'
       CTT: 'L'
       GAA: 'E'
       GAC: 'D'
       GAG: 'E'
       GAT: 'D'
       GCA: 'A'
       GCC: 'A'
       GCG: 'A'
       GCT: 'A'
       GGA: 'G'
       GGC: 'G'
       GGG: 'G'
       GGT: 'G'
       GTA: 'V'
       GTC: 'V'
       GTG: 'V'
       GTT: 'V'
       TAA: '*'
       TAC: 'Y'
       TAG: '*'
       TAT: 'Y'
       TCA: 'S'
       TCC: 'S'
       TCG: 'S'
       TCT: 'S'
       TGA: '*'
       TGC: 'C'
       TGG: 'W'
       TGT: 'C'
       TTA: 'L'
       TTC: 'F'
       TTG: 'L'
       TTT: 'F'
    Starts: {'ATG'  'CTG'  'TTG'}

Codon counts, reading frame 1:
AAA - 132     AAC -  91     AAG -  74     AAT - 104     
ACA -  66     ACC -  54     ACG -  20     ACT -  60     
AGA - 135     AGC - 114     AGG - 123     AGT -  84     
ATA -  24     ATC -  57     ATG -  61     ATT -  52     
CAA -  86     CAC - 101     CAG -  56     CAT - 108     
CCA -  65     CCC -  34     CCG -  14     CCT -  51     
CGA -  27     CGC -  23     CGG -  23     CGT -  31     
CTA -  23     CTC -  38     CTG -  40     CTT -  50     
GAA - 108     GAC -  68     GAG -  56     GAT -  73     
GCA -  29     GCC -  24     GCG -  13     GCT -  64     
GGA - 109     GGC -  66     GGG -  55     GGT -  60     
GTA -  19     GTC -  35     GTG -  37     GTT -  61     
TAA -  25     TAC -  15     TAG -  26     TAT -  40     
TCA -  33     TCC -  35     TCG -  10     TCT -  54     
TGA - 110     TGC -  51     TGG -  88     TGT -  61     
TTA -  14     TTC -  32     TTG -  26     TTT -  50     

Codon counts, reading frame 2:
AAA - 144     AAC -  79     AAG -  74     AAT -  54     
ACA - 121     ACC -  64     ACG -  40     ACT -  50     
AGA - 104     AGC -  34     AGG -  45     AGT -  29     
ATA -  93     ATC -  51     ATG - 126     ATT -  55     
CAA -  69     CAC -  40     CAG -  48     CAT -  36     
CCA -  72     CCC -  28     CCG -  20     CCT -  27     
CGA -  16     CGC -  13     CGG -  13     CGT -  15     
CTA -  79     CTC -  39     CTG -  74     CTT -  37     
GAA - 136     GAC -  89     GAG -  95     GAT -  61     
GCA -  86     GCC -  87     GCG -  25     GCT -  56     
GGA - 167     GGC -  36     GGG -  47     GGT -  39     
GTA -  37     GTC -  55     GTG -  96     GTT -  48     
TAA -   5     TAC -  34     TAG -   3     TAT -  38     
TCA -  76     TCC -  31     TCG -  14     TCT -  41     
TGA -   2     TGC -  31     TGG -  98     TGT -  33     
TTA -  43     TTC -  59     TTG -  59     TTT -  52     

Codon counts, reading frame 3:
AAA - 113     AAC -  64     AAG - 130     AAT -  47     
ACA - 105     ACC -  57     ACG -  33     ACT -  47     
AGA -  96     AGC -  35     AGG -  67     AGT -  22     
ATA -  24     ATC -  30     ATG - 105     ATT -  30     
CAA - 116     CAC -  57     CAG - 123     CAT -  59     
CCA -  99     CCC -  43     CCG -  26     CCT -  42     
CGA -  26     CGC -  14     CGG -  33     CGT -  26     
CTA -  29     CTC -  44     CTG -  70     CTT -  30     
GAA - 102     GAC -  42     GAG -  98     GAT -  47     
GCA -  53     GCC -  18     GCG -  20     GCT -  23     
GGA -  78     GGC -  30     GGG -  56     GGT -  39     
GTA -  16     GTC -  16     GTG -  58     GTT -  26     
TAA -  70     TAC -  37     TAG - 104     TAT -  41     
TCA -  94     TCC -  46     TCG -  25     TCT -  39     
TGA - 105     TGC -  51     TGG - 134     TGT -  65     
TTA -  37     TTC -  42     TTG -  77     TTT -  36     

orfs = 1×3 struct
Fields    Start          Stop
1         1×46 double    1×46 double
2         [92,10607]     10271
3         1×64 double    1×63 double
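
The frame 2 row can be turned into an ORF length with a few lines of arithmetic. The sketch below re-enters the frame 2 Start and Stop values by hand (frame2 and the other variable names are hypothetical) and assumes that Stop marks the first base of the stop codon, which is consistent with the annotated CDS ending at position 10273.

```matlab
% Start/Stop values copied from the frame 2 row of the orfs output above.
frame2.Start = [92 10607];
frame2.Stop  = 10271;

% An ORF that runs off the end of the sequence has a Start but no Stop,
% so only pair up the entries that have both.
n_closed = min(numel(frame2.Start), numel(frame2.Stop));

% Length from the start codon up to (not including) the stop codon,
% under the assumption that Stop is the first base of the stop codon.
orf_lengths = frame2.Stop(1:n_closed) - frame2.Start(1:n_closed);

[longest_nt, idx] = max(orf_lengths);
fprintf('Longest closed ORF in frame 2: start %d, stop %d, %d nt (%d codons)\n', ...
    frame2.Start(idx), frame2.Stop(idx), longest_nt, longest_nt / 3);
```

The ORF starts at position 92 rather than 95 because alternative start codons were enabled, and the codon at 92..94 (ctg) precedes the annotated ATG at position 95 in the same frame.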

gbkStruct = struct with fields:
                LocusName: 'AY145123'
      LocusSequenceLength: '10705'
     LocusNumberofStrands: ''
            LocusTopology: 'linear'
        LocusMoleculeType: 'RNA'
     LocusGenBankDivision: 'VRL'
    LocusModificationDate: '30-DEC-2002'
               Definition: 'Dengue virus type 1 recombinant clone rDEN1delta30, complete genome.'
                Accession: 'AY145123'
                  Version: 'AY145123.1'
                       GI: ''
                  Project: []
                   DBLink: []
                 Keywords: []
                  Segment: []
                   Source: 'Dengue virus 1'
           SourceOrganism: [3×67 char]
                Reference: {[1×1 struct]  [1×1 struct]}
                  Comment: []
                 Features: [101×74 char]
                      CDS: [1×1 struct]
                 Sequence: 'agttgttagtctacgtggaccgacaagaacagtttcgaatcggaagcttgcttaacgtagttctaacagttttttattagagagcagatctctgatgaacaaccaacggaaaaagacgggtcgaccgtctttcaatatgctgaaacgcgcgagaaaccgcgtgtcaactgtttcacagttggcgaagagattctcaaaaggattgctttcaggccaaggacccatgaaattggtgatggcttttatagcattcctaagatttctagccatacctccaacagcaggaattttggctagatggggctcattcaagaagaatggagcgatcaaagtgttacggggtttcaagaaagaaatctcaaacatgttgaacataatgaacaggaggaaaagatctgtgaccatgctcctcatgctgctgcccacagccctggcgttccatctgaccacccgagggggagagccgcacatgatagttagcaagcaggaaagaggaaaatcacttttgtttaagacctctgcaggtgtcaacatgtgcacccttattgcaatggatttgggagagttatgtgaggacacaatgacctacaaatgcccccggatcactgagacggaaccagatgacgttgactgttggtgcaatgccacggagacatgggtgacctatggaacatgttctcaaactggtgaacaccgacgagacaaacgttccgtcgcactggcaccacacgtagggcttggtctagaaacaagaaccgaaacgtggatgtcctctgaaggcgcttggaaacaaatacaaaaagtggagacctgggctctgagacacccaggattcacggtgatagccctttttctagcacatgccataggaacatccatcacccagaaagggatcatttttattttgctgatgctggtaactccatccatggccatgcggtgcgtgggaataggcaacagagacttcgtggaaggactgtcaggagctacgtgggtggatgtggtactggagcatggaagttgcgtcactaccatggcaaaagacaaaccaacactggacattgaactcttgaagacggaggtcacaaaccctgccgtcctgcgcaaactgtgcattgaagctaaaatatcaaacaccaccaccgattcgagatgtccaacacaaggagaagccacgctggtggaagaacaggacacgaactttgtgtgtcgacgaacgttcgtggacagaggctggggcaatggttgtgggctattcggaaaaggtagcttaataacgtgtgctaagtttaagtgtgtgacaaaactggaaggaaagatagtccaatatgaaaacttaaaatattcagtgatagtcaccgtacacactggagaccagcaccaagttggaaatgagaccacagaacatggaacaactgcaaccataacacctcaagctcccacgtcggaaatacagctgacagactacggagctctaacattggattgttcacctagaacagggctagactttaatgagatggtgttgttgacaatggaaaaaaaatcatggctcgtccacaaacaatggtttctagacttaccactgccttggacctcgggggcttcaacatcccaagagacttggaatagacaagacttgctggtcacatttaagacagctcatgcaaaaaagcaggaagtagtcgtactaggatcacaagaaggagcaatgcacactgcgttgactggagcgacagaaatccaatcgtctggaacgacaacaatttttgcaggacacctgaaatgcagactaaaaatggataaactgactttaaaagggatgtcatatgtaatgtgcacagggtcattcaagttagagaaggaagtggctgagacccagcatggaactgttctagtgcaggttaaatacgaaggaacagatgcaccatgcaagatccccttctcgtcccaagatgagaagggagtaacc
cagaatgggagattgataacagccaaccccatagtcactgacaaagaaaaaccagtcaacattgaagcggagccaccttttggtgagagctacattgtggtaggagcaggtgaaaaagctttgaaactaagctggttcaagaagggaagcagtatagggaaaatgtttgaagcaactgcccgtggagcacgaaggatggccatcctgggagacactgcatgggacttcggttctataggaggggtgttcacgtctgtgggaaaactgatacaccagatttttgggactgcgtatggagttttgttcagcggtgtttcttggaccatgaagataggaatagggattctgctgacatggctaggattaaactcaaggagcacgtccctttcaatgacgtgtatcgcagttggcatggtcacactgtacctaggagtcatggttcaggcggactcgggatgtgtaatcaactggaaaggcagagaactcaaatgtggaagcggcatttttgtcaccaatgaagtccacacctggacagagcaatataaattccaggccgactcccctaagagactatcagcggccattgggaaggcatgggaggagggtgtgtgtggaattcgatcagccactcgtctcgagaacatcatgtggaagcaaatatcaaatgaattaaaccacatcttacttgaaaatgacatgaaatttacagtggtcgtaggagacgttagtggaatcttggcccaaggaaagaaaatgattaggccacaacccatggaacacaaatactcgtggaaaagctggggaaaagccaaaatcataggagcagatgtacagaataccaccttcatcatcgacggcccaaacaccccagaatgccctgataaccaaagagcatggaacatttgggaagttgaagactatggatttggaattttcacgacaaacatatggttgaaattgcgtgactcctacactcaagtgtgtgaccaccggctaatgtcagctgccatcaaggatagcaaagcagtccatgctgacatggggtactggatagaaagtgaaaagaacgagacttggaagttggcaagagcctccttcatagaagttaagacatgcatctggccaaaatcccacactctatggagcaatggagtcctggaaagtgagatgataatcccaaagatatatggaggaccaatatctcagcacaactacagaccaggatatttcacacaaacagcagggccgtggcacttgggcaagttagaactagattttgatttatgtgaaggtaccactgttgttgtggatgaacattgtggaaatcgaggaccatctcttagaaccacaacagtcacaggaaagacaatccatgaatggtgctgtagatcttgcacgttaccccccctacgtttcaaaggagaagacgggtgctggtacggcatggaaatcagaccagtcaaggagaaggaagagaacctagttaagtcaatggtctctgcagggtcaggagaagtggacagtttttcactaggactgctatgcatatcaataatgatcgaagaggtaatgagatccagatggagcagaaaaatgctgatgactggaacattggctgtgttcctccttctcacaatgggacaattgacatggaatgatctgatcaggctatgtatcatggttggagccaacgcttcagacaagatggggatgggaacaacgtacctagctttgatggccactttcagaatgagaccaatgttcgcagtcgggctactgtttcgcagattaacatctagagaagttcttcttcttacagttggattgagtctggtggcatctgtagaactaccaaattccttagaggagctaggggatggacttgcaatgggcatcatgatgttgaaattactgactgattttcagtcacatcagctatgggctaccttgctgtctttaacatttgtcaaaacaactttttcattgcactatgcatggaa
gacaatggctatgatactgtcaattgtatctctcttccctttatgcctgtccacgacttctcaaaaaacaacatggcttccggtgttgctgggatctcttggatgcaaaccactaaccatgtttcttataacagaaaacaaaatctggggaaggaaaagctggcctctcaatgaaggaattatggctgttggaatagttagcattcttctaagttcacttctcaagaatgatgtgccactagctggcccactaatagctggaggcatgctaatagcatgttatgtcatatctggaagctcggccgatttatcactggagaaagcggctgaggtctcctgggaagaagaagcagaacactctggtgcctcacacaacatactagtggaggtccaagatgatggaaccatgaaaataaaggatgaagagagagatgacacactcaccattctcctcaaagcaactctgctagcaatctcaggggtatacccaatgtcaataccggcgaccctctttgtgtggtatttttggcagaaaaagaaacagagatcaggagtgctatgggacacacccagccctccagaagtggaaagagcagtccttgatgatggcatttatagaattctccaaagaggattgttgggcaggtctcaagtaggagtaggagtttttcaagaaggcgtgttccacacaatgtggcacgtcaccaggggagctgtcctcatgtaccaagggaagagactggaaccaagttgggccagtgtcaaaaaagacttgatctcatatggaggaggttggaggtttcaaggatcctggaacgcgggagaagaagtgcaggtgattgctgttgaaccggggaagaaccccaaaaatgtacagacagcgccgggtaccttcaagacccctgaaggcgaagttggagccatagctctagactttaaacccggcacatctggatctcctatcgtgaacagagagggaaaaatagtaggtctttatggaaatggagtggtgacaacaagtggtacctacgtcagtgccatagctcaagctaaagcatcacaagaagggcctctaccagagattgaggacgaggtgtttaggaaaagaaacttaacaataatggacctacatccaggatcgggaaaaacaagaagataccttccagccatagtccgtgaggccataaaaagaaagctgcgcacgctagtcttagctcccacaagagttgtcgcttctgaaatggcagaggcgctcaagggaatgccaataaggtatcagacaacagcagtgaagagtgaacacacgggaaaggagatagttgaccttatgtgtcacgccactttcactatgcgtctcctgtctcctgtgagagttcccaattataatatgattatcatggatgaagcacatttcaccgatccagccagcatagcagccagagggtatatctcaacccgagtgggtatgggtgaagcagctgcgattttcatgacagccactccccccggatcggtggaggcctttccacagagcaatgcagttatccaagatgaggaaagagacattcctgaaagatcatggaactcaggctatgactggatcactgatttcccaggtaaaacagtctggtttgttccaagcatcaaatcaggaaatgacattgccaactgtttaagaaagaatgggaaacgggtggtccaattgagcagaaaaacttttgacactgagtaccagaaaacaaaaaataacgactgggactatgttgtcacaacagacatatccgaaatgggagcaaacttccgagccgacagggtaatagacccgaggcggtgcctgaaaccggtaatactaaaagatggcccagagcgtgtcattctagccggaccgatgccagtgactgtggctagcgccgcccagaggagaggaagaattggaaggaaccaaaataaggaaggcgatcagtatatttacatgggacagcctctaaaaaatgatgagg
accacgcccattggacagaagcaaaaatgctccttgacaacataaacacaccagaagggattatcccagccctctttgagccggagagagaaaagagtgcagcaatagacggggaatacagactacggggtgaagcgaggaaaacgttcgtggagctcatgagaagaggagatctacctgtctggctatcctacaaagttgcctcagaaggcttccagtactccgacagaaggtggtgctttgatggggaaaggaacaaccaggtgttggaggagaacatggacgtggagatctggacaaaagaaggagaaagaaagaaactacgaccccgctggctggatgccagaacatactctgacccactggctctgcgcgaattcaaagagttcgcagcaggaagaagaagcgtctcaggtgacctaatattagaaatagggaaacttccacaacatttaacgcaaagggcccagaacgccttggacaatctggttatgttgcacaactctgaacaaggaggaaaagcctatagacacgccatggaagaactaccagacaccatagaaacgttaatgctcctagctttgatagctgtgctgactggtggagtgacgttgttcttcctatcaggaaggggtctaggaaaaacatccattggcctactctgcgtgattgcctcaagtgcactgttatggatggccagtgtggaaccccattggatagcggcctctatcatactggagttctttctgatggtgttgcttattccagagccggacagacagcgcactccacaagacaaccagctagcatacgtggtgataggtctgttattcatgatattgacagtggcagccaatgagatgggattactggaaaccacaaagaaggacctggggattggtcatgcagctgctgaaaaccaccatcatgctgcaatgctggacgtagacctacatccagcttcagcctggactctctatgcagtggccacaacaattatcactcccatgatgagacacacaattgaaaacacaacggcaaatatttccctgacagctattgcaaaccaggcagctatattgatgggacttgacaagggatggccaatatcaaagatggacataggagttccacttctcgccttggggtgctattctcaggtgaacccgctgacgctgacagcggcggtatttatgctagtggctcattatgccataattggacccggactgcaagcaaaagctactagagaagctcaaaaaaggacagcagccggaataatgaaaaacccaactgtcgacgggatcgttgcaatagatttggaccctgtggtttacgatgcaaaatttgaaaaacagctaggccaaataatgttgttgatactttgcacatcacagatcctcctgatgcggaccacatgggccttgtgtgaatccatcacactagccactggacctctgaccacgctttgggagggatctccaggaaaattctggaacaccacgatagcggtgtccatggcaaacatttttaggggaagttatctagcaggagcaggtctggccttttcattaatgaaatctctaggaggaggtaggagaggcacgggagcccaaggggaaacactgggagaaaaatggaaaagacagctaaaccaattgagcaagtcagaattcaacacttacaaaaggagtgggattatagaggtggatagatctgaagccaaagaggggttaaaaagaggagaaacgactaaacacgcagtgtcgagaggaacggccaaactgaggtggtttgtggagaggaaccttgtgaaaccagaagggaaagtcatagacctcggttgtggaagaggtggctggtcatattattgcgctgggctgaagaaagtcacagaagtgaaaggatacacgaaaggaggacctggacatgaggaaccaatcccaatggcaacctatggatggaacctagtaaagctatactccgggaaagatgtattc
tttacaccacctgagaaatgtgacaccctcttgtgtgatattggtgagtcctctccgaacccaactatagaagaaggaagaacgttacgtgttctaaagatggtggaaccatggctcagaggaaaccaattttgcataaaaattctaaatccctatatgccgagtgtggtagaaactttggagcaaatgcaaagaaaacatggaggaatgctagtgcgaaatccactctcaagaaactccactcatgaaatgtactgggtttcatgtggaacaggaaacattgtgtcagcagtaaacatgacatctagaatgctgctaaatcgattcacaatggctcacaggaagccaacatatgaaagagacgtggacttaggcgctggaacaagacatgtggcagtagaaccagaggtggccaacctagatatcattggccagaggatagagaatataaaaaatgaacacaaatcaacatggcattatgatgaggacaatccatacaaaacatgggcctatcatggatcatatgaggtcaagccatcaggatcagcctcatccatggtcaatggtgtggtgagactgctaaccaaaccatgggatgtcattcccatggtcacacaaatagccatgactgacaccacaccctttggacaacagagggtgtttaaagagaaagttgacacgcgtacaccaaaagcgaaacgaggcacagcacaaattatggaggtgacagccaggtggttatggggttttctctctagaaacaaaaaacccagaatctgcacaagagaggagttcacaagaaaagtcaggtcaaacgcagctattggagcagtgttcgttgatgaaaatcaatggaactcagcaaaagaggcagtggaagatgaacggttctgggaccttgtgcacagagagagggagcttcataaacaaggaaaatgtgccacgtgtgtctacaacatgatgggaaagagagagaaaaaattaggagagttcggaaaggcaaaaggaagtcgcgcaatatggtacatgtggttgggagcgcgctttttagagtttgaagcccttggtttcatgaatgaagatcactggttcagcagagagaattcactcagtggagtggaaggagaaggactccacaaacttggatacatactcagagacatatcaaagattccagggggaaatatgtatgcagatgacacagccggatgggacacaagaataacagaggatgatcttcagaatgaggccaaaatcactgacatcatggaacctgaacatgccctattggccacgtcaatctttaagctaacctaccaaaacaaggtagtaagggtgcagagaccagcgaaaaatggaaccgtgatggatgtcatatccagacgtgaccagagaggaagtggacaggttggaacctatggcttaaacaccttcaccaacatggaggcccaactaataagacaaatggagtctgagggaatcttttcacccagcgaattggaaaccccaaatctagccgaaagagtcctcgactggttgaaaaaacatggcaccgagaggctgaaaagaatggcaatcagtggagatgactgtgtggtgaaaccaattgatgacagatttgcaacagccttaacagctttgaatgacatgggaaaggtaagaaaagacataccgcaatgggaaccttcaaaaggatggaatgattggcaacaagtgcctttctgttcacaccatttccaccagctgattatgaaggatgggagggagatagtggtgccatgccgcaaccaagatgaacttgtaggtagggccagagtatcacaaggcgccggatggagcttgagagaaactgcatgcctaggcaagtcatatgcacaaatgtggcagctgatgtacttccacaggagagacttgagattagcggctaatgctatctgttcagccgttccagttgattgggtcccaaccagccgtaccacctggtcgatccatgcccacca
tcaatggatgacaacagaagacatgttg…'
                SearchURL: 'https://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=AY145123'
              RetrieveURL: 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=22901065&rettype=gb&retmode=text'

CDS = struct with fields:
       Location: '95..10273'
        Indices: [95 10273]
    codon_start: '1'
        product: 'polyprotein precursor'
     protein_id: 'AAN06983.1'
    translation: 'MNNQRKKTGRPSFNMLKRARNRVSTVSQLAKRFSKGLLSGQGPMKLVMAFIAFLRFLAIPPTAGILARWGSFKKNGAIKVLRGFKKEISNMLNIMNRRKRSVTMLLMLLPTALAFHLTTRGGEPHMIVSKQERGKSLLFKTSAGVNMCTLIAMDLGELCEDTMTYKCPRITETEPDDVDCWCNATETWVTYGTCSQTGEHRRDKRSVALAPHVGLGLETRTETWMSSEGAWKQIQKVETWALRHPGFTVIALFLAHAIGTSITQKGIIFILLMLVTPSMAMRCVGIGNRDFVEGLSGATWVDVVLEHGSCVTTMAKDKPTLDIELLKTEVTNPAVLRKLCIEAKISNTTTDSRCPTQGEATLVEEQDTNFVCRRTFVDRGWGNGCGLFGKGSLITCAKFKCVTKLEGKIVQYENLKYSVIVTVHTGDQHQVGNETTEHGTTATITPQAPTSEIQLTDYGALTLDCSPRTGLDFNEMVLLTMEKKSWLVHKQWFLDLPLPWTSGASTSQETWNRQDLLVTFKTAHAKKQEVVVLGSQEGAMHTALTGATEIQSSGTTTIFAGHLKCRLKMDKLTLKGMSYVMCTGSFKLEKEVAETQHGTVLVQVKYEGTDAPCKIPFSSQDEKGVTQNGRLITANPIVTDKEKPVNIEAEPPFGESYIVVGAGEKALKLSWFKKGSSIGKMFEATARGARRMAILGDTAWDFGSIGGVFTSVGKLIHQIFGTAYGVLFSGVSWTMKIGIGILLTWLGLNSRSTSLSMTCIAVGMVTLYLGVMVQADSGCVINWKGRELKCGSGIFVTNEVHTWTEQYKFQADSPKRLSAAIGKAWEEGVCGIRSATRLENIMWKQISNELNHILLENDMKFTVVVGDVSGILAQGKKMIRPQPMEHKYSWKSWGKAKIIGADVQNTTFIIDGPNTPECPDNQRAWNIWEVEDYGFGIFTTNIWLKLRDSYTQVCDHRLMSAAIKDSKAVHADMGYWIESEKNETWKLARASFIEVKTCIWPKSHTLWSNGVLESEMIIPKIYGGPISQHNYRPGYFTQTAGPWHLGKLELDFDLCEGTTVVVDEHCGNRGPSLRTTTVTGKTIHEWCCRSCTLPPLRFKGEDGCWYGMEIRPVKEKEENLVKSMVSAGSGEVDSFSLGLLCISIMIEEVMRSRWSRKMLMTGTLAVFLLLTMGQLTWNDLIRLCIMVGANASDKMGMGTTYLALMATFRMRPMFAVGLLFRRLTSREVLLLTVGLSLVASVELPNSLEELGDGLAMGIMMLKLLTDFQSHQLWATLLSLTFVKTTFSLHYAWKTMAMILSIVSLFPLCLSTTSQKTTWLPVLLGSLGCKPLTMFLITENKIWGRKSWPLNEGIMAVGIVSILLSSLLKNDVPLAGPLIAGGMLIACYVISGSSADLSLEKAAEVSWEEEAEHSGASHNILVEVQDDGTMKIKDEERDDTLTILLKATLLAISGVYPMSIPATLFVWYFWQKKKQRSGVLWDTPSPPEVERAVLDDGIYRILQRGLLGRSQVGVGVFQEGVFHTMWHVTRGAVLMYQGKRLEPSWASVKKDLISYGGGWRFQGSWNAGEEVQVIAVEPGKNPKNVQTAPGTFKTPEGEVGAIALDFKPGTSGSPIVNREGKIVGLYGNGVVTTSGTYVSAIAQAKASQEGPLPEIEDEVFRKRNLTIMDLHPGSGKTRRYLPAIVREAIKRKLRTLVLAPTRVVASEMAEALKGMPIRYQTTAVKSEHTGKEIVDLMCHATFTMRLLSPVRVPNYNMIIMDEAHFTDPASIAARGYISTRVGMGEAAAIFMTATPPGSVEAFPQSNAVIQDEERDIPERSWNSGYDWITDFPGKTVWFVPSIKSGNDIANCLRKNGKRVVQLSRKTFDTEYQKTKNNDWDYVVTTDISEMGANFRADRVIDPRRCLKPVILKDGPERVILAGPMPVTVASAAQRRGRIGRNQNKEGDQYIYMGQPLKNDEDHAHWTEAKMLLDNINTPEGIIP
ALFEPEREKSAAIDGEYRLRGEARKTFVELMRRGDLPVWLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVEIWTKEGERKKLRPRWLDARTYSDPLALREFKEFAAGRRSVSGDLILEIGKLPQHLTQRAQNALDNLVMLHNSEQGGKAYRHAMEELPDTIETLMLLALIAVLTGGVTLFFLSGRGLGKTSIGLLCVIASSALLWMASVEPHWIAASIILEFFLMVLLIPEPDRQRTPQDNQLAYVVIGLLFMILTVAANEMGLLETTKKDLGIGHAAAENHHHAAMLDVDLHPASAWTLYAVATTIITPMMRHTIENTTANISLTAIANQAAILMGLDKGWPISKMDIGVPLLALGCYSQVNPLTLTAAVFMLVAHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIVAIDLDPVVYDAKFEKQLGQIMLLILCTSQILLMRTTWALCESITLATGPLTTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGLAFSLMKSLGGGRRGTGAQGETLGEKWKRQLNQLSKSEFNTYKRSGIIEVDRSEAKEGLKRGETTKHAVSRGTAKLRWFVERNLVKPEGKVIDLGCGRGGWSYYCAGLKKVTEVKGYTKGGPGHEEPIPMATYGWNLVKLYSGKDVFFTPPEKCDTLLCDIGESSPNPTIEEGRTLRVLKMVEPWLRGNQFCIKILNPYMPSVVETLEQMQRKHGGMLVRNPLSRNSTHEMYWVSCGTGNIVSAVNMTSRMLLNRFTMAHRKPTYERDVDLGAGTRHVAVEPEVANLDIIGQRIENIKNEHKSTWHYDEDNPYKTWAYHGSYEVKPSGSASSMVNGVVRLLTKPWDVIPMVTQIAMTDTTPFGQQRVFKEKVDTRTPKAKRGTAQIMEVTARWLWGFLSRNKKPRICTREEFTRKVRSNAAIGAVFVDENQWNSAKEAVEDERFWDLVHRERELHKQGKCATCVYNMMGKREKKLGEFGKAKGSRAIWYMWLGARFLEFEALGFMNEDHWFSRENSLSGVEGEGLHKLGYILRDISKIPGGNMYADDTAGWDTRITEDDLQNEAKITDIMEPEHALLATSIFKLTYQNKVVRVQRPAKNGTVMDVISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMESEGIFSPSELETPNLAERVLDWLKKHGTERLKRMAISGDDCVVKPIDDRFATALTALNDMGKVRKDIPQWEPSKGWNDWQQVPFCSHHFHQLIMKDGREIVVPCRNQDELVGRARVSQGAGWSLRETACLGKSYAQMWQLMYFHRRDLRLAANAICSAVPVDWVPTSRTTWSIHAHHQWMTTEDMLSVWNRVWIEENPWMEDKTHVSSWEDVPYLGKREDQWCGSLIGLTARATWATNIQVAINQVRRLIGNENYLDFMTSMKRFKNESDPEGALW'

CDS = struct with fields:
       Location: '95..10273'
        Indices: [95 10273]
    codon_start: '1'
        product: 'polyprotein precursor'
     protein_id: 'AAN06983.1'
    translation: 'MNNQRKKTGRPSFNMLKRARNRVSTVSQLAKRFSKGLLSGQGPMKLVMAFIAFLRFLAIPPTAGILARWGSFKKNGAIKVLRGFKKEISNMLNIMNRRKRSVTMLLMLLPTALAFHLTTRGGEPHMIVSKQERGKSLLFKTSAGVNMCTLIAMDLGELCEDTMTYKCPRITETEPDDVDCWCNATETWVTYGTCSQTGEHRRDKRSVALAPHVGLGLETRTETWMSSEGAWKQIQKVETWALRHPGFTVIALFLAHAIGTSITQKGIIFILLMLVTPSMAMRCVGIGNRDFVEGLSGATWVDVVLEHGSCVTTMAKDKPTLDIELLKTEVTNPAVLRKLCIEAKISNTTTDSRCPTQGEATLVEEQDTNFVCRRTFVDRGWGNGCGLFGKGSLITCAKFKCVTKLEGKIVQYENLKYSVIVTVHTGDQHQVGNETTEHGTTATITPQAPTSEIQLTDYGALTLDCSPRTGLDFNEMVLLTMEKKSWLVHKQWFLDLPLPWTSGASTSQETWNRQDLLVTFKTAHAKKQEVVVLGSQEGAMHTALTGATEIQSSGTTTIFAGHLKCRLKMDKLTLKGMSYVMCTGSFKLEKEVAETQHGTVLVQVKYEGTDAPCKIPFSSQDEKGVTQNGRLITANPIVTDKEKPVNIEAEPPFGESYIVVGAGEKALKLSWFKKGSSIGKMFEATARGARRMAILGDTAWDFGSIGGVFTSVGKLIHQIFGTAYGVLFSGVSWTMKIGIGILLTWLGLNSRSTSLSMTCIAVGMVTLYLGVMVQADSGCVINWKGRELKCGSGIFVTNEVHTWTEQYKFQADSPKRLSAAIGKAWEEGVCGIRSATRLENIMWKQISNELNHILLENDMKFTVVVGDVSGILAQGKKMIRPQPMEHKYSWKSWGKAKIIGADVQNTTFIIDGPNTPECPDNQRAWNIWEVEDYGFGIFTTNIWLKLRDSYTQVCDHRLMSAAIKDSKAVHADMGYWIESEKNETWKLARASFIEVKTCIWPKSHTLWSNGVLESEMIIPKIYGGPISQHNYRPGYFTQTAGPWHLGKLELDFDLCEGTTVVVDEHCGNRGPSLRTTTVTGKTIHEWCCRSCTLPPLRFKGEDGCWYGMEIRPVKEKEENLVKSMVSAGSGEVDSFSLGLLCISIMIEEVMRSRWSRKMLMTGTLAVFLLLTMGQLTWNDLIRLCIMVGANASDKMGMGTTYLALMATFRMRPMFAVGLLFRRLTSREVLLLTVGLSLVASVELPNSLEELGDGLAMGIMMLKLLTDFQSHQLWATLLSLTFVKTTFSLHYAWKTMAMILSIVSLFPLCLSTTSQKTTWLPVLLGSLGCKPLTMFLITENKIWGRKSWPLNEGIMAVGIVSILLSSLLKNDVPLAGPLIAGGMLIACYVISGSSADLSLEKAAEVSWEEEAEHSGASHNILVEVQDDGTMKIKDEERDDTLTILLKATLLAISGVYPMSIPATLFVWYFWQKKKQRSGVLWDTPSPPEVERAVLDDGIYRILQRGLLGRSQVGVGVFQEGVFHTMWHVTRGAVLMYQGKRLEPSWASVKKDLISYGGGWRFQGSWNAGEEVQVIAVEPGKNPKNVQTAPGTFKTPEGEVGAIALDFKPGTSGSPIVNREGKIVGLYGNGVVTTSGTYVSAIAQAKASQEGPLPEIEDEVFRKRNLTIMDLHPGSGKTRRYLPAIVREAIKRKLRTLVLAPTRVVASEMAEALKGMPIRYQTTAVKSEHTGKEIVDLMCHATFTMRLLSPVRVPNYNMIIMDEAHFTDPASIAARGYISTRVGMGEAAAIFMTATPPGSVEAFPQSNAVIQDEERDIPERSWNSGYDWITDFPGKTVWFVPSIKSGNDIANCLRKNGKRVVQLSRKTFDTEYQKTKNNDWDYVVTTDISEMGANFRADRVIDPRRCLKPVILKDGPERVILAGPMPVTVASAAQRRGRIGRNQNKEGDQYIYMGQPLKNDEDHAHWTEAKMLLDNINTPEGIIP
ALFEPEREKSAAIDGEYRLRGEARKTFVELMRRGDLPVWLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVEIWTKEGERKKLRPRWLDARTYSDPLALREFKEFAAGRRSVSGDLILEIGKLPQHLTQRAQNALDNLVMLHNSEQGGKAYRHAMEELPDTIETLMLLALIAVLTGGVTLFFLSGRGLGKTSIGLLCVIASSALLWMASVEPHWIAASIILEFFLMVLLIPEPDRQRTPQDNQLAYVVIGLLFMILTVAANEMGLLETTKKDLGIGHAAAENHHHAAMLDVDLHPASAWTLYAVATTIITPMMRHTIENTTANISLTAIANQAAILMGLDKGWPISKMDIGVPLLALGCYSQVNPLTLTAAVFMLVAHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIVAIDLDPVVYDAKFEKQLGQIMLLILCTSQILLMRTTWALCESITLATGPLTTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGLAFSLMKSLGGGRRGTGAQGETLGEKWKRQLNQLSKSEFNTYKRSGIIEVDRSEAKEGLKRGETTKHAVSRGTAKLRWFVERNLVKPEGKVIDLGCGRGGWSYYCAGLKKVTEVKGYTKGGPGHEEPIPMATYGWNLVKLYSGKDVFFTPPEKCDTLLCDIGESSPNPTIEEGRTLRVLKMVEPWLRGNQFCIKILNPYMPSVVETLEQMQRKHGGMLVRNPLSRNSTHEMYWVSCGTGNIVSAVNMTSRMLLNRFTMAHRKPTYERDVDLGAGTRHVAVEPEVANLDIIGQRIENIKNEHKSTWHYDEDNPYKTWAYHGSYEVKPSGSASSMVNGVVRLLTKPWDVIPMVTQIAMTDTTPFGQQRVFKEKVDTRTPKAKRGTAQIMEVTARWLWGFLSRNKKPRICTREEFTRKVRSNAAIGAVFVDENQWNSAKEAVEDERFWDLVHRERELHKQGKCATCVYNMMGKREKKLGEFGKAKGSRAIWYMWLGARFLEFEALGFMNEDHWFSRENSLSGVEGEGLHKLGYILRDISKIPGGNMYADDTAGWDTRITEDDLQNEAKITDIMEPEHALLATSIFKLTYQNKVVRVQRPAKNGTVMDVISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMESEGIFSPSELETPNLAERVLDWLKKHGTERLKRMAISGDDCVVKPIDDRFATALTALNDMGKVRKDIPQWEPSKGWNDWQQVPFCSHHFHQLIMKDGREIVVPCRNQDELVGRARVSQGAGWSLRETACLGKSYAQMWQLMYFHRRDLRLAANAICSAVPVDWVPTSRTTWSIHAHHQWMTTEDMLSVWNRVWIEENPWMEDKTHVSSWEDVPYLGKREDQWCGSLIGLTARATWATNIQVAINQVRRLIGNENYLDFMTSMKRFKNESDPEGALW'

coding_sequences = struct with fields:
       Location: '95..10273'
        Indices: [95 10273]
    codon_start: '1'
        product: 'polyprotein precursor'
     protein_id: 'AAN06983.1'
    translation: 'MNNQRKKTGRPSFNMLKRARNRVSTVSQLAKRFSKGLLSGQGPMKLVMAFIAFLRFLAIPPTAGILARWGSFKKNGAIKVLRGFKKEISNMLNIMNRRKRSVTMLLMLLPTALAFHLTTRGGEPHMIVSKQERGKSLLFKTSAGVNMCTLIAMDLGELCEDTMTYKCPRITETEPDDVDCWCNATETWVTYGTCSQTGEHRRDKRSVALAPHVGLGLETRTETWMSSEGAWKQIQKVETWALRHPGFTVIALFLAHAIGTSITQKGIIFILLMLVTPSMAMRCVGIGNRDFVEGLSGATWVDVVLEHGSCVTTMAKDKPTLDIELLKTEVTNPAVLRKLCIEAKISNTTTDSRCPTQGEATLVEEQDTNFVCRRTFVDRGWGNGCGLFGKGSLITCAKFKCVTKLEGKIVQYENLKYSVIVTVHTGDQHQVGNETTEHGTTATITPQAPTSEIQLTDYGALTLDCSPRTGLDFNEMVLLTMEKKSWLVHKQWFLDLPLPWTSGASTSQETWNRQDLLVTFKTAHAKKQEVVVLGSQEGAMHTALTGATEIQSSGTTTIFAGHLKCRLKMDKLTLKGMSYVMCTGSFKLEKEVAETQHGTVLVQVKYEGTDAPCKIPFSSQDEKGVTQNGRLITANPIVTDKEKPVNIEAEPPFGESYIVVGAGEKALKLSWFKKGSSIGKMFEATARGARRMAILGDTAWDFGSIGGVFTSVGKLIHQIFGTAYGVLFSGVSWTMKIGIGILLTWLGLNSRSTSLSMTCIAVGMVTLYLGVMVQADSGCVINWKGRELKCGSGIFVTNEVHTWTEQYKFQADSPKRLSAAIGKAWEEGVCGIRSATRLENIMWKQISNELNHILLENDMKFTVVVGDVSGILAQGKKMIRPQPMEHKYSWKSWGKAKIIGADVQNTTFIIDGPNTPECPDNQRAWNIWEVEDYGFGIFTTNIWLKLRDSYTQVCDHRLMSAAIKDSKAVHADMGYWIESEKNETWKLARASFIEVKTCIWPKSHTLWSNGVLESEMIIPKIYGGPISQHNYRPGYFTQTAGPWHLGKLELDFDLCEGTTVVVDEHCGNRGPSLRTTTVTGKTIHEWCCRSCTLPPLRFKGEDGCWYGMEIRPVKEKEENLVKSMVSAGSGEVDSFSLGLLCISIMIEEVMRSRWSRKMLMTGTLAVFLLLTMGQLTWNDLIRLCIMVGANASDKMGMGTTYLALMATFRMRPMFAVGLLFRRLTSREVLLLTVGLSLVASVELPNSLEELGDGLAMGIMMLKLLTDFQSHQLWATLLSLTFVKTTFSLHYAWKTMAMILSIVSLFPLCLSTTSQKTTWLPVLLGSLGCKPLTMFLITENKIWGRKSWPLNEGIMAVGIVSILLSSLLKNDVPLAGPLIAGGMLIACYVISGSSADLSLEKAAEVSWEEEAEHSGASHNILVEVQDDGTMKIKDEERDDTLTILLKATLLAISGVYPMSIPATLFVWYFWQKKKQRSGVLWDTPSPPEVERAVLDDGIYRILQRGLLGRSQVGVGVFQEGVFHTMWHVTRGAVLMYQGKRLEPSWASVKKDLISYGGGWRFQGSWNAGEEVQVIAVEPGKNPKNVQTAPGTFKTPEGEVGAIALDFKPGTSGSPIVNREGKIVGLYGNGVVTTSGTYVSAIAQAKASQEGPLPEIEDEVFRKRNLTIMDLHPGSGKTRRYLPAIVREAIKRKLRTLVLAPTRVVASEMAEALKGMPIRYQTTAVKSEHTGKEIVDLMCHATFTMRLLSPVRVPNYNMIIMDEAHFTDPASIAARGYISTRVGMGEAAAIFMTATPPGSVEAFPQSNAVIQDEERDIPERSWNSGYDWITDFPGKTVWFVPSIKSGNDIANCLRKNGKRVVQLSRKTFDTEYQKTKNNDWDYVVTTDISEMGANFRADRVIDPRRCLKPVILKDGPERVILAGPMPVTVASAAQRRGRIGRNQNKEGDQYIYMGQPLKNDEDHAHWTEAKMLLDNINTPEGIIP
ALFEPEREKSAAIDGEYRLRGEARKTFVELMRRGDLPVWLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVEIWTKEGERKKLRPRWLDARTYSDPLALREFKEFAAGRRSVSGDLILEIGKLPQHLTQRAQNALDNLVMLHNSEQGGKAYRHAMEELPDTIETLMLLALIAVLTGGVTLFFLSGRGLGKTSIGLLCVIASSALLWMASVEPHWIAASIILEFFLMVLLIPEPDRQRTPQDNQLAYVVIGLLFMILTVAANEMGLLETTKKDLGIGHAAAENHHHAAMLDVDLHPASAWTLYAVATTIITPMMRHTIENTTANISLTAIANQAAILMGLDKGWPISKMDIGVPLLALGCYSQVNPLTLTAAVFMLVAHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIVAIDLDPVVYDAKFEKQLGQIMLLILCTSQILLMRTTWALCESITLATGPLTTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGLAFSLMKSLGGGRRGTGAQGETLGEKWKRQLNQLSKSEFNTYKRSGIIEVDRSEAKEGLKRGETTKHAVSRGTAKLRWFVERNLVKPEGKVIDLGCGRGGWSYYCAGLKKVTEVKGYTKGGPGHEEPIPMATYGWNLVKLYSGKDVFFTPPEKCDTLLCDIGESSPNPTIEEGRTLRVLKMVEPWLRGNQFCIKILNPYMPSVVETLEQMQRKHGGMLVRNPLSRNSTHEMYWVSCGTGNIVSAVNMTSRMLLNRFTMAHRKPTYERDVDLGAGTRHVAVEPEVANLDIIGQRIENIKNEHKSTWHYDEDNPYKTWAYHGSYEVKPSGSASSMVNGVVRLLTKPWDVIPMVTQIAMTDTTPFGQQRVFKEKVDTRTPKAKRGTAQIMEVTARWLWGFLSRNKKPRICTREEFTRKVRSNAAIGAVFVDENQWNSAKEAVEDERFWDLVHRERELHKQGKCATCVYNMMGKREKKLGEFGKAKGSRAIWYMWLGARFLEFEALGFMNEDHWFSRENSLSGVEGEGLHKLGYILRDISKIPGGNMYADDTAGWDTRITEDDLQNEAKITDIMEPEHALLATSIFKLTYQNKVVRVQRPAKNGTVMDVISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMESEGIFSPSELETPNLAERVLDWLKKHGTERLKRMAISGDDCVVKPIDDRFATALTALNDMGKVRKDIPQWEPSKGWNDWQQVPFCSHHFHQLIMKDGREIVVPCRNQDELVGRARVSQGAGWSLRETACLGKSYAQMWQLMYFHRRDLRLAANAICSAVPVDWVPTSRTTWSIHAHHQWMTTEDMLSVWNRVWIEENPWMEDKTHVSSWEDVPYLGKREDQWCGSLIGLTARATWATNIQVAINQVRRLIGNENYLDFMTSMKRFKNESDPEGALW'

DVCDS = struct with fields:
       Location: '95..10273'
        Indices: [95 10273]
    codon_start: '1'
        product: 'polyprotein precursor'
     protein_id: 'AAN06983.1'
    translation: 'MNNQRKKTGRPSFNMLKRARNRVSTVSQLAKRFSKGLLSGQGPMKLVMAFIAFLRFLAIPPTAGILARWGSFKKNGAIKVLRGFKKEISNMLNIMNRRKRSVTMLLMLLPTALAFHLTTRGGEPHMIVSKQERGKSLLFKTSAGVNMCTLIAMDLGELCEDTMTYKCPRITETEPDDVDCWCNATETWVTYGTCSQTGEHRRDKRSVALAPHVGLGLETRTETWMSSEGAWKQIQKVETWALRHPGFTVIALFLAHAIGTSITQKGIIFILLMLVTPSMAMRCVGIGNRDFVEGLSGATWVDVVLEHGSCVTTMAKDKPTLDIELLKTEVTNPAVLRKLCIEAKISNTTTDSRCPTQGEATLVEEQDTNFVCRRTFVDRGWGNGCGLFGKGSLITCAKFKCVTKLEGKIVQYENLKYSVIVTVHTGDQHQVGNETTEHGTTATITPQAPTSEIQLTDYGALTLDCSPRTGLDFNEMVLLTMEKKSWLVHKQWFLDLPLPWTSGASTSQETWNRQDLLVTFKTAHAKKQEVVVLGSQEGAMHTALTGATEIQSSGTTTIFAGHLKCRLKMDKLTLKGMSYVMCTGSFKLEKEVAETQHGTVLVQVKYEGTDAPCKIPFSSQDEKGVTQNGRLITANPIVTDKEKPVNIEAEPPFGESYIVVGAGEKALKLSWFKKGSSIGKMFEATARGARRMAILGDTAWDFGSIGGVFTSVGKLIHQIFGTAYGVLFSGVSWTMKIGIGILLTWLGLNSRSTSLSMTCIAVGMVTLYLGVMVQADSGCVINWKGRELKCGSGIFVTNEVHTWTEQYKFQADSPKRLSAAIGKAWEEGVCGIRSATRLENIMWKQISNELNHILLENDMKFTVVVGDVSGILAQGKKMIRPQPMEHKYSWKSWGKAKIIGADVQNTTFIIDGPNTPECPDNQRAWNIWEVEDYGFGIFTTNIWLKLRDSYTQVCDHRLMSAAIKDSKAVHADMGYWIESEKNETWKLARASFIEVKTCIWPKSHTLWSNGVLESEMIIPKIYGGPISQHNYRPGYFTQTAGPWHLGKLELDFDLCEGTTVVVDEHCGNRGPSLRTTTVTGKTIHEWCCRSCTLPPLRFKGEDGCWYGMEIRPVKEKEENLVKSMVSAGSGEVDSFSLGLLCISIMIEEVMRSRWSRKMLMTGTLAVFLLLTMGQLTWNDLIRLCIMVGANASDKMGMGTTYLALMATFRMRPMFAVGLLFRRLTSREVLLLTVGLSLVASVELPNSLEELGDGLAMGIMMLKLLTDFQSHQLWATLLSLTFVKTTFSLHYAWKTMAMILSIVSLFPLCLSTTSQKTTWLPVLLGSLGCKPLTMFLITENKIWGRKSWPLNEGIMAVGIVSILLSSLLKNDVPLAGPLIAGGMLIACYVISGSSADLSLEKAAEVSWEEEAEHSGASHNILVEVQDDGTMKIKDEERDDTLTILLKATLLAISGVYPMSIPATLFVWYFWQKKKQRSGVLWDTPSPPEVERAVLDDGIYRILQRGLLGRSQVGVGVFQEGVFHTMWHVTRGAVLMYQGKRLEPSWASVKKDLISYGGGWRFQGSWNAGEEVQVIAVEPGKNPKNVQTAPGTFKTPEGEVGAIALDFKPGTSGSPIVNREGKIVGLYGNGVVTTSGTYVSAIAQAKASQEGPLPEIEDEVFRKRNLTIMDLHPGSGKTRRYLPAIVREAIKRKLRTLVLAPTRVVASEMAEALKGMPIRYQTTAVKSEHTGKEIVDLMCHATFTMRLLSPVRVPNYNMIIMDEAHFTDPASIAARGYISTRVGMGEAAAIFMTATPPGSVEAFPQSNAVIQDEERDIPERSWNSGYDWITDFPGKTVWFVPSIKSGNDIANCLRKNGKRVVQLSRKTFDTEYQKTKNNDWDYVVTTDISEMGANFRADRVIDPRRCLKPVILKDGPERVILAGPMPVTVASAAQRRGRIGRNQNKEGDQYIYMGQPLKNDEDHAHWTEAKMLLDNINTPEGIIP
ALFEPEREKSAAIDGEYRLRGEARKTFVELMRRGDLPVWLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVEIWTKEGERKKLRPRWLDARTYSDPLALREFKEFAAGRRSVSGDLILEIGKLPQHLTQRAQNALDNLVMLHNSEQGGKAYRHAMEELPDTIETLMLLALIAVLTGGVTLFFLSGRGLGKTSIGLLCVIASSALLWMASVEPHWIAASIILEFFLMVLLIPEPDRQRTPQDNQLAYVVIGLLFMILTVAANEMGLLETTKKDLGIGHAAAENHHHAAMLDVDLHPASAWTLYAVATTIITPMMRHTIENTTANISLTAIANQAAILMGLDKGWPISKMDIGVPLLALGCYSQVNPLTLTAAVFMLVAHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIVAIDLDPVVYDAKFEKQLGQIMLLILCTSQILLMRTTWALCESITLATGPLTTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGLAFSLMKSLGGGRRGTGAQGETLGEKWKRQLNQLSKSEFNTYKRSGIIEVDRSEAKEGLKRGETTKHAVSRGTAKLRWFVERNLVKPEGKVIDLGCGRGGWSYYCAGLKKVTEVKGYTKGGPGHEEPIPMATYGWNLVKLYSGKDVFFTPPEKCDTLLCDIGESSPNPTIEEGRTLRVLKMVEPWLRGNQFCIKILNPYMPSVVETLEQMQRKHGGMLVRNPLSRNSTHEMYWVSCGTGNIVSAVNMTSRMLLNRFTMAHRKPTYERDVDLGAGTRHVAVEPEVANLDIIGQRIENIKNEHKSTWHYDEDNPYKTWAYHGSYEVKPSGSASSMVNGVVRLLTKPWDVIPMVTQIAMTDTTPFGQQRVFKEKVDTRTPKAKRGTAQIMEVTARWLWGFLSRNKKPRICTREEFTRKVRSNAAIGAVFVDENQWNSAKEAVEDERFWDLVHRERELHKQGKCATCVYNMMGKREKKLGEFGKAKGSRAIWYMWLGARFLEFEALGFMNEDHWFSRENSLSGVEGEGLHKLGYILRDISKIPGGNMYADDTAGWDTRITEDDLQNEAKITDIMEPEHALLATSIFKLTYQNKVVRVQRPAKNGTVMDVISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMESEGIFSPSELETPNLAERVLDWLKKHGTERLKRMAISGDDCVVKPIDDRFATALTALNDMGKVRKDIPQWEPSKGWNDWQQVPFCSHHFHQLIMKDGREIVVPCRNQDELVGRARVSQGAGWSLRETACLGKSYAQMWQLMYFHRRDLRLAANAICSAVPVDWVPTSRTTWSIHAHHQWMTTEDMLSVWNRVWIEENPWMEDKTHVSSWEDVPYLGKREDQWCGSLIGLTARATWATNIQVAINQVRRLIGNENYLDFMTSMKRFKNESDPEGALW'

fseq = struct with fields:
      Header: 'AY145123.1 Dengue virus type 1 recombinant clone rDEN1delta30, complete genome'
    Sequence: 'AGTTGTTAGTCTACGTGGACCGACAAGAACAGTTTCGAATCGGAAGCTTGCTTAACGTAGTTCTAACAGTTTTTTATTAGAGAGCAGATCTCTGATGAACAACCAACGGAAAAAGACGGGTCGACCGTCTTTCAATATGCTGAAACGCGCGAGAAACCGCGTGTCAACTGTTTCACAGTTGGCGAAGAGATTCTCAAAAGGATTGCTTTCAGGCCAAGGACCCATGAAATTGGTGATGGCTTTTATAGCATTCCTAAGATTTCTAGCCATACCTCCAACAGCAGGAATTTTGGCTAGATGGGGCTCATTCAAGAAGAATGGAGCGATCAAAGTGTTACGGGGTTTCAAGAAAGAAATCTCAAACATGTTGAACATAATGAACAGGAGGAAAAGATCTGTGACCATGCTCCTCATGCTGCTGCCCACAGCCCTGGCGTTCCATCTGACCACCCGAGGGGGAGAGCCGCACATGATAGTTAGCAAGCAGGAAAGAGGAAAATCACTTTTGTTTAAGACCTCTGCAGGTGTCAACATGTGCACCCTTATTGCAATGGATTTGGGAGAGTTATGTGAGGACACAATGACCTACAAATGCCCCCGGATCACTGAGACGGAACCAGATGACGTTGACTGTTGGTGCAATGCCACGGAGACATGGGTGACCTATGGAACATGTTCTCAAACTGGTGAACACCGACGAGACAAACGTTCCGTCGCACTGGCACCACACGTAGGGCTTGGTCTAGAAACAAGAACCGAAACGTGGATGTCCTCTGAAGGCGCTTGGAAACAAATACAAAAAGTGGAGACCTGGGCTCTGAGACACCCAGGATTCACGGTGATAGCCCTTTTTCTAGCACATGCCATAGGAACATCCATCACCCAGAAAGGGATCATTTTTATTTTGCTGATGCTGGTAACTCCATCCATGGCCATGCGGTGCGTGGGAATAGGCAACAGAGACTTCGTGGAAGGACTGTCAGGAGCTACGTGGGTGGATGTGGTACTGGAGCATGGAAGTTGCGTCACTACCATGGCAAAAGACAAACCAACACTGGACATTGAACTCTTGAAGACGGAGGTCACAAACCCTGCCGTCCTGCGCAAACTGTGCATTGAAGCTAAAATATCAAACACCACCACCGATTCGAGATGTCCAACACAAGGAGAAGCCACGCTGGTGGAAGAACAGGACACGAACTTTGTGTGTCGACGAACGTTCGTGGACAGAGGCTGGGGCAATGGTTGTGGGCTATTCGGAAAAGGTAGCTTAATAACGTGTGCTAAGTTTAAGTGTGTGACAAAACTGGAAGGAAAGATAGTCCAATATGAAAACTTAAAATATTCAGTGATAGTCACCGTACACACTGGAGACCAGCACCAAGTTGGAAATGAGACCACAGAACATGGAACAACTGCAACCATAACACCTCAAGCTCCCACGTCGGAAATACAGCTGACAGACTACGGAGCTCTAACATTGGATTGTTCACCTAGAACAGGGCTAGACTTTAATGAGATGGTGTTGTTGACAATGGAAAAAAAATCATGGCTCGTCCACAAACAATGGTTTCTAGACTTACCACTGCCTTGGACCTCGGGGGCTTCAACATCCCAAGAGACTTGGAATAGACAAGACTTGCTGGTCACATTTAAGACAGCTCATGCAAAAAAGCAGGAAGTAGTCGTACTAGGATCACAAGAAGGAGCAATGCACACTGCGTTGACTGGAGCGACAGAAATCCAATCGTCTGGAACGACAACAATTTTTGCAGGACACCTGAAATGCAGACTAAAAATGGATAAACTGACTTTAAAAGGGATGTCATATGTAATGTGCACAGGGTCATTCAAGTTAGAGAAGGAAGTGGCTGAGACCCAGCATGGAACTGTTCTAGTGCAGGTTAAATACGAAGGAACAGATGCACCATGCAAGATCCCCTTCTCGTCCCAAGATGAGAAGGGAGTAACCCAGAATGGGAGAT
TGATAACAGCCAACCCCATAGTCACTGACAAAGAAAAACCAGTCAACATTGAAGCGGAGCCACCTTTTGGTGAGAGCTACATTGTGGTAGGAGCAGGTGAAAAAGCTTTGAAACTAAGCTGGTTCAAGAAGGGAAGCAGTATAGGGAAAATGTTTGAAGCAACTGCCCGTGGAGCACGAAGGATGGCCATCCTGGGAGACACTGCATGGGACTTCGGTTCTATAGGAGGGGTGTTCACGTCTGTGGGAAAACTGATACACCAGATTTTTGGGACTGCGTATGGAGTTTTGTTCAGCGGTGTTTCTTGGACCATGAAGATAGGAATAGGGATTCTGCTGACATGGCTAGGATTAAACTCAAGGAGCACGTCCCTTTCAATGACGTGTATCGCAGTTGGCATGGTCACACTGTACCTAGGAGTCATGGTTCAGGCGGACTCGGGATGTGTAATCAACTGGAAAGGCAGAGAACTCAAATGTGGAAGCGGCATTTTTGTCACCAATGAAGTCCACACCTGGACAGAGCAATATAAATTCCAGGCCGACTCCCCTAAGAGACTATCAGCGGCCATTGGGAAGGCATGGGAGGAGGGTGTGTGTGGAATTCGATCAGCCACTCGTCTCGAGAACATCATGTGGAAGCAAATATCAAATGAATTAAACCACATCTTACTTGAAAATGACATGAAATTTACAGTGGTCGTAGGAGACGTTAGTGGAATCTTGGCCCAAGGAAAGAAAATGATTAGGCCACAACCCATGGAACACAAATACTCGTGGAAAAGCTGGGGAAAAGCCAAAATCATAGGAGCAGATGTACAGAATACCACCTTCATCATCGACGGCCCAAACACCCCAGAATGCCCTGATAACCAAAGAGCATGGAACATTTGGGAAGTTGAAGACTATGGATTTGGAATTTTCACGACAAACATATGGTTGAAATTGCGTGACTCCTACACTCAAGTGTGTGACCACCGGCTAATGTCAGCTGCCATCAAGGATAGCAAAGCAGTCCATGCTGACATGGGGTACTGGATAGAAAGTGAAAAGAACGAGACTTGGAAGTTGGCAAGAGCCTCCTTCATAGAAGTTAAGACATGCATCTGGCCAAAATCCCACACTCTATGGAGCAATGGAGTCCTGGAAAGTGAGATGATAATCCCAAAGATATATGGAGGACCAATATCTCAGCACAACTACAGACCAGGATATTTCACACAAACAGCAGGGCCGTGGCACTTGGGCAAGTTAGAACTAGATTTTGATTTATGTGAAGGTACCACTGTTGTTGTGGATGAACATTGTGGAAATCGAGGACCATCTCTTAGAACCACAACAGTCACAGGAAAGACAATCCATGAATGGTGCTGTAGATCTTGCACGTTACCCCCCCTACGTTTCAAAGGAGAAGACGGGTGCTGGTACGGCATGGAAATCAGACCAGTCAAGGAGAAGGAAGAGAACCTAGTTAAGTCAATGGTCTCTGCAGGGTCAGGAGAAGTGGACAGTTTTTCACTAGGACTGCTATGCATATCAATAATGATCGAAGAGGTAATGAGATCCAGATGGAGCAGAAAAATGCTGATGACTGGAACATTGGCTGTGTTCCTCCTTCTCACAATGGGACAATTGACATGGAATGATCTGATCAGGCTATGTATCATGGTTGGAGCCAACGCTTCAGACAAGATGGGGATGGGAACAACGTACCTAGCTTTGATGGCCACTTTCAGAATGAGACCAATGTTCGCAGTCGGGCTACTGTTTCGCAGATTAACATCTAGAGAAGTTCTTCTTCTTACAGTTGGATTGAGTCTGGTGGCATCTGTAGAACTACCAAATTCCTTAGAGGAGCTAGGGGATGGACTTGCAATGGGCATCATGATGTTGAAATTACTGACTGATTTTCAGTCACATCAGCTATGGGCTACCTTGCTGTCTTTAACATTTGTCAAAACAACTTTTTCATTGCACTATGCATGGAAGACAATGGCTATG
ATACTGTCAATTGTATCTCTCTTCCCTTTATGCCTGTCCACGACTTCTCAAAAAACAACATGGCTTCCGGTGTTGCTGGGATCTCTTGGATGCAAACCACTAACCATGTTTCTTATAACAGAAAACAAAATCTGGGGAAGGAAAAGCTGGCCTCTCAATGAAGGAATTATGGCTGTTGGAATAGTTAGCATTCTTCTAAGTTCACTTCTCAAGAATGATGTGCCACTAGCTGGCCCACTAATAGCTGGAGGCATGCTAATAGCATGTTATGTCATATCTGGAAGCTCGGCCGATTTATCACTGGAGAAAGCGGCTGAGGTCTCCTGGGAAGAAGAAGCAGAACACTCTGGTGCCTCACACAACATACTAGTGGAGGTCCAAGATGATGGAACCATGAAAATAAAGGATGAAGAGAGAGATGACACACTCACCATTCTCCTCAAAGCAACTCTGCTAGCAATCTCAGGGGTATACCCAATGTCAATACCGGCGACCCTCTTTGTGTGGTATTTTTGGCAGAAAAAGAAACAGAGATCAGGAGTGCTATGGGACACACCCAGCCCTCCAGAAGTGGAAAGAGCAGTCCTTGATGATGGCATTTATAGAATTCTCCAAAGAGGATTGTTGGGCAGGTCTCAAGTAGGAGTAGGAGTTTTTCAAGAAGGCGTGTTCCACACAATGTGGCACGTCACCAGGGGAGCTGTCCTCATGTACCAAGGGAAGAGACTGGAACCAAGTTGGGCCAGTGTCAAAAAAGACTTGATCTCATATGGAGGAGGTTGGAGGTTTCAAGGATCCTGGAACGCGGGAGAAGAAGTGCAGGTGATTGCTGTTGAACCGGGGAAGAACCCCAAAAATGTACAGACAGCGCCGGGTACCTTCAAGACCCCTGAAGGCGAAGTTGGAGCCATAGCTCTAGACTTTAAACCCGGCACATCTGGATCTCCTATCGTGAACAGAGAGGGAAAAATAGTAGGTCTTTATGGAAATGGAGTGGTGACAACAAGTGGTACCTACGTCAGTGCCATAGCTCAAGCTAAAGCATCACAAGAAGGGCCTCTACCAGAGATTGAGGACGAGGTGTTTAGGAAAAGAAACTTAACAATAATGGACCTACATCCAGGATCGGGAAAAACAAGAAGATACCTTCCAGCCATAGTCCGTGAGGCCATAAAAAGAAAGCTGCGCACGCTAGTCTTAGCTCCCACAAGAGTTGTCGCTTCTGAAATGGCAGAGGCGCTCAAGGGAATGCCAATAAGGTATCAGACAACAGCAGTGAAGAGTGAACACACGGGAAAGGAGATAGTTGACCTTATGTGTCACGCCACTTTCACTATGCGTCTCCTGTCTCCTGTGAGAGTTCCCAATTATAATATGATTATCATGGATGAAGCACATTTCACCGATCCAGCCAGCATAGCAGCCAGAGGGTATATCTCAACCCGAGTGGGTATGGGTGAAGCAGCTGCGATTTTCATGACAGCCACTCCCCCCGGATCGGTGGAGGCCTTTCCACAGAGCAATGCAGTTATCCAAGATGAGGAAAGAGACATTCCTGAAAGATCATGGAACTCAGGCTATGACTGGATCACTGATTTCCCAGGTAAAACAGTCTGGTTTGTTCCAAGCATCAAATCAGGAAATGACATTGCCAACTGTTTAAGAAAGAATGGGAAACGGGTGGTCCAATTGAGCAGAAAAACTTTTGACACTGAGTACCAGAAAACAAAAAATAACGACTGGGACTATGTTGTCACAACAGACATATCCGAAATGGGAGCAAACTTCCGAGCCGACAGGGTAATAGACCCGAGGCGGTGCCTGAAACCGGTAATACTAAAAGATGGCCCAGAGCGTGTCATTCTAGCCGGACCGATGCCAGTGACTGTGGCTAGCGCCGCCCAGAGGAGAGGAAGAATTGGAAGGAACCAAAATAAGGAAGGCGATCAGTATATTTACATGGGACAGCCTCTAAAAAATGATGAGGACCACGCCCATTG
GACAGAAGCAAAAATGCTCCTTGACAACATAAACACACCAGAAGGGATTATCCCAGCCCTCTTTGAGCCGGAGAGAGAAAAGAGTGCAGCAATAGACGGGGAATACAGACTACGGGGTGAAGCGAGGAAAACGTTCGTGGAGCTCATGAGAAGAGGAGATCTACCTGTCTGGCTATCCTACAAAGTTGCCTCAGAAGGCTTCCAGTACTCCGACAGAAGGTGGTGCTTTGATGGGGAAAGGAACAACCAGGTGTTGGAGGAGAACATGGACGTGGAGATCTGGACAAAAGAAGGAGAAAGAAAGAAACTACGACCCCGCTGGCTGGATGCCAGAACATACTCTGACCCACTGGCTCTGCGCGAATTCAAAGAGTTCGCAGCAGGAAGAAGAAGCGTCTCAGGTGACCTAATATTAGAAATAGGGAAACTTCCACAACATTTAACGCAAAGGGCCCAGAACGCCTTGGACAATCTGGTTATGTTGCACAACTCTGAACAAGGAGGAAAAGCCTATAGACACGCCATGGAAGAACTACCAGACACCATAGAAACGTTAATGCTCCTAGCTTTGATAGCTGTGCTGACTGGTGGAGTGACGTTGTTCTTCCTATCAGGAAGGGGTCTAGGAAAAACATCCATTGGCCTACTCTGCGTGATTGCCTCAAGTGCACTGTTATGGATGGCCAGTGTGGAACCCCATTGGATAGCGGCCTCTATCATACTGGAGTTCTTTCTGATGGTGTTGCTTATTCCAGAGCCGGACAGACAGCGCACTCCACAAGACAACCAGCTAGCATACGTGGTGATAGGTCTGTTATTCATGATATTGACAGTGGCAGCCAATGAGATGGGATTACTGGAAACCACAAAGAAGGACCTGGGGATTGGTCATGCAGCTGCTGAAAACCACCATCATGCTGCAATGCTGGACGTAGACCTACATCCAGCTTCAGCCTGGACTCTCTATGCAGTGGCCACAACAATTATCACTCCCATGATGAGACACACAATTGAAAACACAACGGCAAATATTTCCCTGACAGCTATTGCAAACCAGGCAGCTATATTGATGGGACTTGACAAGGGATGGCCAATATCAAAGATGGACATAGGAGTTCCACTTCTCGCCTTGGGGTGCTATTCTCAGGTGAACCCGCTGACGCTGACAGCGGCGGTATTTATGCTAGTGGCTCATTATGCCATAATTGGACCCGGACTGCAAGCAAAAGCTACTAGAGAAGCTCAAAAAAGGACAGCAGCCGGAATAATGAAAAACCCAACTGTCGACGGGATCGTTGCAATAGATTTGGACCCTGTGGTTTACGATGCAAAATTTGAAAAACAGCTAGGCCAAATAATGTTGTTGATACTTTGCACATCACAGATCCTCCTGATGCGGACCACATGGGCCTTGTGTGAATCCATCACACTAGCCACTGGACCTCTGACCACGCTTTGGGAGGGATCTCCAGGAAAATTCTGGAACACCACGATAGCGGTGTCCATGGCAAACATTTTTAGGGGAAGTTATCTAGCAGGAGCAGGTCTGGCCTTTTCATTAATGAAATCTCTAGGAGGAGGTAGGAGAGGCACGGGAGCCCAAGGGGAAACACTGGGAGAAAAATGGAAAAGACAGCTAAACCAATTGAGCAAGTCAGAATTCAACACTTACAAAAGGAGTGGGATTATAGAGGTGGATAGATCTGAAGCCAAAGAGGGGTTAAAAAGAGGAGAAACGACTAAACACGCAGTGTCGAGAGGAACGGCCAAACTGAGGTGGTTTGTGGAGAGGAACCTTGTGAAACCAGAAGGGAAAGTCATAGACCTCGGTTGTGGAAGAGGTGGCTGGTCATATTATTGCGCTGGGCTGAAGAAAGTCACAGAAGTGAAAGGATACACGAAAGGAGGACCTGGACATGAGGAACCAATCCCAATGGCAACCTATGGATGGAACCTAGTAAAGCTATACTCCGGGAAAGATGTATTCTTTACACCACCTG
AGAAATGTGACACCCTCTTGTGTGATATTGGTGAGTCCTCTCCGAACCCAACTATAGAAGAAGGAAGAACGTTACGTGTTCTAAAGATGGTGGAACCATGGCTCAGAGGAAACCAATTTTGCATAAAAATTCTAAATCCCTATATGCCGAGTGTGGTAGAAACTTTGGAGCAAATGCAAAGAAAACATGGAGGAATGCTAGTGCGAAATCCACTCTCAAGAAACTCCACTCATGAAATGTACTGGGTTTCATGTGGAACAGGAAACATTGTGTCAGCAGTAAACATGACATCTAGAATGCTGCTAAATCGATTCACAATGGCTCACAGGAAGCCAACATATGAAAGAGACGTGGACTTAGGCGCTGGAACAAGACATGTGGCAGTAGAACCAGAGGTGGCCAACCTAGATATCATTGGCCAGAGGATAGAGAATATAAAAAATGAACACAAATCAACATGGCATTATGATGAGGACAATCCATACAAAACATGGGCCTATCATGGATCATATGAGGTCAAGCCATCAGGATCAGCCTCATCCATGGTCAATGGTGTGGTGAGACTGCTAACCAAACCATGGGATGTCATTCCCATGGTCACACAAATAGCCATGACTGACACCACACCCTTTGGACAACAGAGGGTGTTTAAAGAGAAAGTTGACACGCGTACACCAAAAGCGAAACGAGGCACAGCACAAATTATGGAGGTGACAGCCAGGTGGTTATGGGGTTTTCTCTCTAGAAACAAAAAACCCAGAATCTGCACAAGAGAGGAGTTCACAAGAAAAGTCAGGTCAAACGCAGCTATTGGAGCAGTGTTCGTTGATGAAAATCAATGGAACTCAGCAAAAGAGGCAGTGGAAGATGAACGGTTCTGGGACCTTGTGCACAGAGAGAGGGAGCTTCATAAACAAGGAAAATGTGCCACGTGTGTCTACAACATGATGGGAAAGAGAGAGAAAAAATTAGGAGAGTTCGGAAAGGCAAAAGGAAGTCGCGCAATATGGTACATGTGGTTGGGAGCGCGCTTTTTAGAGTTTGAAGCCCTTGGTTTCATGAATGAAGATCACTGGTTCAGCAGAGAGAATTCACTCAGTGGAGTGGAAGGAGAAGGACTCCACAAACTTGGATACATACTCAGAGACATATCAAAGATTCCAGGGGGAAATATGTATGCAGATGACACAGCCGGATGGGACACAAGAATAACAGAGGATGATCTTCAGAATGAGGCCAAAATCACTGACATCATGGAACCTGAACATGCCCTATTGGCCACGTCAATCTTTAAGCTAACCTACCAAAACAAGGTAGTAAGGGTGCAGAGACCAGCGAAAAATGGAACCGTGATGGATGTCATATCCAGACGTGACCAGAGAGGAAGTGGACAGGTTGGAACCTATGGCTTAAACACCTTCACCAACATGGAGGCCCAACTAATAAGACAAATGGAGTCTGAGGGAATCTTTTCACCCAGCGAATTGGAAACCCCAAATCTAGCCGAAAGAGTCCTCGACTGGTTGAAAAAACATGGCACCGAGAGGCTGAAAAGAATGGCAATCAGTGGAGATGACTGTGTGGTGAAACCAATTGATGACAGATTTGCAACAGCCTTAACAGCTTTGAATGACATGGGAAAGGTAAGAAAAGACATACCGCAATGGGAACCTTCAAAAGGATGGAATGATTGGCAACAAGTGCCTTTCTGTTCACACCATTTCCACCAGCTGATTATGAAGGATGGGAGGGAGATAGTGGTGCCATGCCGCAACCAAGATGAACTTGTAGGTAGGGCCAGAGTATCACAAGGCGCCGGATGGAGCTTGAGAGAAACTGCATGCCTAGGCAAGTCATATGCACAAATGTGGCAGCTGATGTACTTCCACAGGAGAGACTTGAGATTAGCGGCTAATGCTATCTGTTCAGCCGTTCCAGTTGATTGGGTCCCAACCAGCCGTACCACCTGGTCGATCCATGCCCACCATCAATGGATGACA
ACAGAAGACATGTTG…'

   1  SC*STWTDKN SFESEACLT* F*QFFIREQI SDEQPTEKDG STVFQYAETR EKPRVNCFTV
  61  GEEILKRIAF RPRTHEIGDG FYSIPKISSH TSNSRNFG*M GLIQEEWSDQ SVTGFQERNL
 121  KHVEHNEQEE KICDHAPHAA AHSPGVPSDH PRGRAAHDS* QAGKRKITFV *DLCRCQHVH
 181  PYCNGFGRVM *GHNDLQMPP DH*DGTR*R* LLVQCHGDMG DLWNMFSNW* TPTRQTFRRT
 241  GTTRRAWSRN KNRNVDVL*R RLETNTKSGD LGSETPRIHG DSPFSSTCHR NIHHPERDHF
 301  YFADAGNSIH GHAVRGNRQQ RLRGRTVRSY VGGCGTGAWK LRHYHGKRQT NTGH*TLEDG
 361  GHKPCRPAQT VH*S*NIKHH HRFEMSNTRR SHAGGRTGHE LCVSTNVRGQ RLGQWLWAIR
 421  KR*LNNVC*V *VCDKTGRKD SPI*KLKIFS DSHRTHWRPA PSWK*DHRTW NNCNHNTSSS
 481  HVGNTADRLR SSNIGLFT*N RARL**DGVV DNGKKIMARP QTMVSRLTTA LDLGGFNIPR
 541  DLE*TRLAGH I*DSSCKKAG SSRTRITRRS NAHCVDWSDR NPIVWNDNNF CRTPEMQTKN
 601  G*TDFKRDVI CNVHRVIQVR EGSG*DPAWN CSSAG*IRRN RCTMQDPLLV PR*EGSNPEW
 661  EIDNSQPHSH *QRKTSQH*S GATFW*ELHC GRSR*KSFET KLVQEGKQYR ENV*SNCPWS
 721  TKDGHPGRHC MGLRFYRRGV HVCGKTDTPD FWDCVWSFVQ RCFLDHEDRN RDSADMARIK
 781  LKEHVPFNDV YRSWHGHTVP RSHGSGGLGM CNQLERQRTQ MWKRHFCHQ* SPHLDRAI*I
 841  PGRLP*ETIS GHWEGMGGGC VWNSISHSSR EHHVEANIK* IKPHLT*K*H EIYSGRRRR*
 901  WNLGPRKEND *ATTHGTQIL VEKLGKSQNH RSRCTEYHLH HRRPKHPRMP **PKSMEHLG
 961  S*RLWIWNFH DKHMVEIA*L LHSSV*PPAN VSCHQG*QSS PC*HGVLDRK *KERDLEVGK
1021  SLLHRS*DMH LAKIPHSMEQ WSPGK*DDNP KDIWRTNISA QLQTRIFHTN SRAVALGQVR
1081  TRF*FM*RYH CCCG*TLWKS RTIS*NHNSH RKDNP*MVL* ILHVTPPTFQ RRRRVLVRHG
1141  NQTSQGEGRE PS*VNGLCRV RRSGQFFTRT AMHINNDRRG NEIQMEQKNA DDWNIGCVPP
1201  SHNGTIDME* SDQAMYHGWS QRFRQDGDGN NVPSFDGHFQ NETNVRSRAT VSQINI*RSS
1261  SSYSWIESGG ICRTTKFLRG ARGWTCNGHH DVEITD*FSV TSAMGYLAVF NICQNNFFIA
1321  LCMEDNGYDT VNCISLPFMP VHDFSKNNMA SGVAGISWMQ TTNHVSYNRK QNLGKEKLAS
1381  Q*RNYGCWNS *HSSKFTSQE *CATSWPTNS WRHANSMLCH IWKLGRFITG ESG*GLLGRR
1441  SRTLWCLTQH TSGGPR*WNH ENKG*RER*H THHSPQSNSA SNLRGIPNVN TGDPLCVVFL
1501  AEKETEIRSA MGHTQPSRSG KSSP**WHL* NSPKRIVGQV SSRSRSFSRR RVPHNVARHQ
1561  GSCPHVPREE TGTKLGQCQK RLDLIWRRLE VSRILERGRR SAGDCC*TGE EPQKCTDSAG
1621  YLQDP*RRSW SHSSRL*TRH IWISYREQRG KNSRSLWKWS GDNKWYLRQC HSSS*SITRR
1681  ASTRD*GRGV *EKKLNNNGP TSRIGKNKKI PSSHSP*GHK KKAAHASLSS HKSCRF*NGR
1741  GAQGNANKVS DNSSEE*THG KGDS*PYVSR HFHYASPVSC ESSQL*YDYH G*STFHRSSQ
1801  HSSQRVYLNP SGYG*SSCDF HDSHSPRIGG GLSTEQCSYP R*GKRHS*KI MELRL*LDH*
1861  FPR*NSLVCS KHQIRK*HCQ LFKKEWETGG PIEQKNF*H* VPENKK*RLG LCCHNRHIRN
1921  GSKLPSRQGN RPEAVPETGN TKRWPRACHS SRTDASDCG* RRPEERKNWK EPK*GRRSVY
1981  LHGTASKK** GPRPLDRSKN AP*QHKHTRR DYPSPL*AGE RKECSNRRGI QTTG*SEENV
2041  RGAHEKRRST CLAILQSCLR RLPVLRQKVV L*WGKEQPGV GGEHGRGDLD KRRRKKETTT
2101  PLAGCQNIL* PTGSARIQRV RSRKKKRLR* PNIRNRETST TFNAKGPERL GQSGYVAQL*
2161  TRRKSL*TRH GRTTRHHRNV NAPSFDSCAD WWSDVVLPIR KGSRKNIHWP TLRDCLKCTV
2221  MDGQCGTPLD SGLYHTGVLS DGVAYSRAGQ TAHSTRQPAS IRGDRSVIHD IDSGSQ*DGI
2281  TGNHKEGPGD WSCSC*KPPS CCNAGRRPTS SFSLDSLCSG HNNYHSHDET HN*KHNGKYF
2341  PDSYCKPGSY IDGT*QGMAN IKDGHRSSTS RLGVLFSGEP ADADSGGIYA SGSLCHNWTR
2401  TASKSY*RSS KKDSSRNNEK PNCRRDRCNR FGPCGLRCKI *KTARPNNVV DTLHITDPPD
2461  ADHMGLV*IH HTSHWTSDHA LGGISRKILE HHDSGVHGKH F*GKLSSRSR SGLFINEISR
2521  RR*ERHGSPR GNTGRKMEKT AKPIEQVRIQ HLQKEWDYRG G*I*SQRGVK KRRND*TRSV
2581  ERNGQTEVVC GEEPCETRRE SHRPRLWKRW LVILLRWAEE SHRSERIHER RTWT*GTNPN
2641  GNLWMEPSKA ILRERCILYT T*EM*HPLV* YW*VLSEPNY RRRKNVTCSK DGGTMAQRKP
2701  ILHKNSKSLY AECGRNFGAN AKKTWRNASA KSTLKKLHS* NVLGFMWNRK HCVSSKHDI*
2761  NAAKSIHNGS QEANI*KRRG LRRWNKTCGS RTRGGQPRYH WPEDREYKK* TQINMAL**G
2821  QSIQNMGLSW II*GQAIRIS LIHGQWCGET ANQTMGCHSH GHTNSHD*HH TLWTTEGV*R
2881  ES*HAYTKSE TRHSTNYGGD SQVVMGFSL* KQKTQNLHKR GVHKKSQVKR SYWSSVR**K
2941  SMELSKRGSG R*TVLGPCAQ REGAS*TRKM CHVCLQHDGK EREKIRRVRK GKRKSRNMVH
3001  VVGSALFRV* SPWFHE*RSL VQQREFTQWS GRRRTPQTWI HTQRHIKDSR GKYVCR*HSR
3061  MGHKNNRG*S SE*GQNH*HH GT*TCPIGHV NL*ANLPKQG SKGAETSEKW NRDGCHIQT*
3121  PERKWTGWNL WLKHLHQHGG PTNKTNGV*G NLFTQRIGNP KSSRKSPRLV EKTWHREAEK
3181  NGNQWR*LCG ETN**QICNS LNSFE*HGKG KKRHTAMGTF KRME*LATSA FLFTPFPPAD
3241  YEGWEGDSGA MPQPR*TCR* GQSITRRRME LERNCMPRQV ICTNVAADVL PQERLEISG*
3301  CYLFSRSS*L GPNQPYHLVD PCPPSMDDNR RHVVSVE*GL DRGKPMDGGQ DSCVQLGRRS
3361  IPRKKGRSMV WIPNRLNSTS HLGHQHTSGH KPSEKAHWE* ELSRLHDINE EIQKRE*SRR
3421  GTLVSQLIHK IKENKKSNKA RSQAGLSHST VRAMLPVSPV QGRKMKSGRK PRFEQAVLPV
3481  APSWGCKNPG GCKPWKLYAW GSRLVVRGDP SQDTTQQRGP RLEVRGDPPH NNKQHIDAGR
3541  DQRSCCLYSI IPGTERQKME WCC*INRF                                   

codons = struct with fields:
    AAA: 132
    AAC: 91
    AAG: 74
    AAT: 104
    ACA: 66
    ACC: 54
    ACG: 20
    ACT: 60
    AGA: 135
    AGC: 114
    AGG: 123
    AGT: 84
    ATA: 24
    ATC: 57
    ATG: 61
    ATT: 52
    CAA: 86
    CAC: 101
    CAG: 56
    CAT: 108
    CCA: 65
    CCC: 34
    CCG: 14
    CCT: 51
    CGA: 27
    CGC: 23
    CGG: 23
    CGT: 31
    CTA: 23
    CTC: 38
    CTG: 40
    CTT: 50
    GAA: 108
    GAC: 68
    GAG: 56
    GAT: 73
    GCA: 29
    GCC: 24
    GCG: 13
    GCT: 64
    GGA: 109
    GGC: 66
    GGG: 55
    GGT: 60
    GTA: 19
    GTC: 35
    GTG: 37
    GTT: 61
    TAA: 25
    TAC: 15
    TAG: 26
    TAT: 40
    TCA: 33
    TCC: 35
    TCG: 10
    TCT: 54
    TGA: 110
    TGC: 51
    TGG: 88
    TGT: 61
    TTA: 14
    TTC: 32
    TTG: 26
    TTT: 50
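
The table above has the shape of the toolbox's codoncount output for reading frame 1: non-overlapping triplets counted from the first nucleotide. Below is a minimal base-MATLAB sketch of the same idea, run on a 15-nucleotide fragment of the sequence. The variable names and the choice of fragment are assumptions made for illustration; this is not the original script.

```matlab
% Count non-overlapping codons in reading frame 1 (illustrative sketch only).
seq = upper('atgaacaaccaacgg');            % short fragment used as a toy input
n = floor(numel(seq) / 3);                 % number of complete codons
codonList = cellstr(reshape(seq(1:3*n), 3, []).');   % one codon per cell
[uniqueCodons, ~, idx] = unique(codonList);          % sorted unique codons
counts = accumarray(idx, 1);               % occurrences of each unique codon
for k = 1:numel(uniqueCodons)
    fprintf('%s: %d\n', uniqueCodons{k}, counts(k));
end
% prints:
% AAC: 2
% ATG: 1
% CAA: 1
% CGG: 1
```

The numbers already shown are internally consistent: ATG (61) and TGG (88) equal the M and W counts in the amino acid tally further down, and AAA + AAG (132 + 74) equals the K count of 206.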

DVSeq = 'SC*STWTDKNSFESEACLT*F*QFFIREQISDEQPTEKDGSTVFQYAETREKPRVNCFTVGEEILKRIAFRPRTHEIGDGFYSIPKISSHTSNSRNFG*MGLIQEEWSDQSVTGFQERNLKHVEHNEQEEKICDHAPHAAAHSPGVPSDHPRGRAAHDS*QAGKRKITFV*DLCRCQHVHPYCNGFGRVM*GHNDLQMPPDH*DGTR*R*LLVQCHGDMGDLWNMFSNW*TPTRQTFRRTGTTRRAWSRNKNRNVDVL*RRLETNTKSGDLGSETPRIHGDSPFSSTCHRNIHHPERDHFYFADAGNSIHGHAVRGNRQQRLRGRTVRSYVGGCGTGAWKLRHYHGKRQTNTGH*TLEDGGHKPCRPAQTVH*S*NIKHHHRFEMSNTRRSHAGGRTGHELCVSTNVRGQRLGQWLWAIRKR*LNNVC*V*VCDKTGRKDSPI*KLKIFSDSHRTHWRPAPSWK*DHRTWNNCNHNTSSSHVGNTADRLRSSNIGLFT*NRARL**DGVVDNGKKIMARPQTMVSRLTTALDLGGFNIPRDLE*TRLAGHI*DSSCKKAGSSRTRITRRSNAHCVDWSDRNPIVWNDNNFCRTPEMQTKNG*TDFKRDVICNVHRVIQVREGSG*DPAWNCSSAG*IRRNRCTMQDPLLVPR*EGSNPEWEIDNSQPHSH*QRKTSQH*SGATFW*ELHCGRSR*KSFETKLVQEGKQYRENV*SNCPWSTKDGHPGRHCMGLRFYRRGVHVCGKTDTPDFWDCVWSFVQRCFLDHEDRNRDSADMARIKLKEHVPFNDVYRSWHGHTVPRSHGSGGLGMCNQLERQRTQMWKRHFCHQ*SPHLDRAI*IPGRLP*ETISGHWEGMGGGCVWNSISHSSREHHVEANIK*IKPHLT*K*HEIYSGRRRR*WNLGPRKEND*ATTHGTQILVEKLGKSQNHRSRCTEYHLHHRRPKHPRMP**PKSMEHLGS*RLWIWNFHDKHMVEIA*LLHSSV*PPANVSCHQG*QSSPC*HGVLDRK*KERDLEVGKSLLHRS*DMHLAKIPHSMEQWSPGK*DDNPKDIWRTNISAQLQTRIFHTNSRAVALGQVRTRF*FM*RYHCCCG*TLWKSRTIS*NHNSHRKDNP*MVL*ILHVTPPTFQRRRRVLVRHGNQTSQGEGREPS*VNGLCRVRRSGQFFTRTAMHINNDRRGNEIQMEQKNADDWNIGCVPPSHNGTIDME*SDQAMYHGWSQRFRQDGDGNNVPSFDGHFQNETNVRSRATVSQINI*RSSSSYSWIESGGICRTTKFLRGARGWTCNGHHDVEITD*FSVTSAMGYLAVFNICQNNFFIALCMEDNGYDTVNCISLPFMPVHDFSKNNMASGVAGISWMQTTNHVSYNRKQNLGKEKLASQ*RNYGCWNS*HSSKFTSQE*CATSWPTNSWRHANSMLCHIWKLGRFITGESG*GLLGRRSRTLWCLTQHTSGGPR*WNHENKG*RER*HTHHSPQSNSASNLRGIPNVNTGDPLCVVFLAEKETEIRSAMGHTQPSRSGKSSP**WHL*NSPKRIVGQVSSRSRSFSRRRVPHNVARHQGSCPHVPREETGTKLGQCQKRLDLIWRRLEVSRILERGRRSAGDCC*TGEEPQKCTDSAGYLQDP*RRSWSHSSRL*TRHIWISYREQRGKNSRSLWKWSGDNKWYLRQCHSSS*SITRRASTRD*GRGV*EKKLNNNGPTSRIGKNKKIPSSHSP*GHKKKAAHASLSSHKSCRF*NGRGAQGNANKVSDNSSEE*THGKGDS*PYVSRHFHYASPVSCESSQL*YDYHG*STFHRSSQHSSQRVYLNPSGYG*SSCDFHDSHSPRIGGGLSTEQCSYPR*GKRHS*KIMELRL*LDH*FPR*NSLVCSKHQIRK*HCQLFKKEWETGGPIEQKNF*H*VPENKK*RLGLCCHNRHIRNGSKLPSRQGNRPEAVPETGNTKRWPRACHSSRTDASDCG*RRPEERKNWKEPK*GRRSVYLHGTASKK**G
PRPLDRSKNAP*QHKHTRRDYPSPL*AGERKECSNRRGIQTTG*SEENVRGAHEKRRSTCLAILQSCLRRLPVLRQKVVL*WGKEQPGVGGEHGRGDLDKRRRKKETTTPLAGCQNIL*PTGSARIQRVRSRKKKRLR*PNIRNRETSTTFNAKGPERLGQSGYVAQL*TRRKSL*TRHGRTTRHHRNVNAPSFDSCADWWSDVVLPIRKGSRKNIHWPTLRDCLKCTVMDGQCGTPLDSGLYHTGVLSDGVAYSRAGQTAHSTRQPASIRGDRSVIHDIDSGSQ*DGITGNHKEGPGDWSCSC*KPPSCCNAGRRPTSSFSLDSLCSGHNNYHSHDETHN*KHNGKYFPDSYCKPGSYIDGT*QGMANIKDGHRSSTSRLGVLFSGEPADADSGGIYASGSLCHNWTRTASKSY*RSSKKDSSRNNEKPNCRRDRCNRFGPCGLRCKI*KTARPNNVVDTLHITDPPDADHMGLV*IHHTSHWTSDHALGGISRKILEHHDSGVHGKHF*GKLSSRSRSGLFINEISRRR*ERHGSPRGNTGRKMEKTAKPIEQVRIQHLQKEWDYRGG*I*SQRGVKKRRND*TRSVERNGQTEVVCGEEPCETRRESHRPRLWKRWLVILLRWAEESHRSERIHERRTWT*GTNPNGNLWMEPSKAILRERCILYTT*EM*HPLV*YW*VLSEPNYRRRKNVTCSKDGGTMAQRKPILHKNSKSLYAECGRNFGANAKKTWRNASAKSTLKKLHS*NVLGFMWNRKHCVSSKHDI*NAAKSIHNGSQEANI*KRRGLRRWNKTCGSRTRGGQPRYHWPEDREYKK*TQINMAL**GQSIQNMGLSWII*GQAIRISLIHGQWCGETANQTMGCHSHGHTNSHD*HHTLWTTEGV*RES*HAYTKSETRHSTNYGGDSQVVMGFSL*KQKTQNLHKRGVHKKSQVKRSYWSSVR**KSMELSKRGSGR*TVLGPCAQREGAS*TRKMCHVCLQHDGKEREKIRRVRKGKRKSRNMVHVVGSALFRV*SPWFHE*RSLVQQREFTQWSGRRRTPQTWIHTQRHIKDSRGKYVCR*HSRMGHKNNRG*SSE*GQNH*HHGT*TCPIGHVNL*ANLPKQGSKGAETSEKWNRDGCHIQT*PERKWTGWNLWLKHLHQHGGPTNKTNGV*GNLFTQRIGNPKSSRKSPRLVEKTWHREAEKNGNQWR*LCGETN**QICNSLNSFE*HGKGKKRHTAMGTFKRME*LATSAFLFTPFPPADYEGWEGDSGAMPQPR*TCR*GQSITRRRMELERNCMPRQVICTNVAADVLPQERLEISG*CYLFSRSS*LGPNQPYHLVDPCPPSMDDNRRHVVSVE*GLDRGKPMDGGQDSCVQLGRRSIPRKKGRSMVWIPNRLNSTSHLGHQHTSGHKPSEKAHWE*ELSRLHDINEEIQKRE*SRRGTLVSQLIHKIKENKKSNKARSQAGLSHSTVRAMLPVSPVQGRKMKSGRKPRFEQAVLPVAPSWGCKNPGGCKPWKLYAWGSRLVVRGDPSQDTTQQRGPRLEVRGDPPHNNKQHIDAGRDQRSCCLYSIIPGTERQKMEWCC*INRF'

DVprotein = 'MNNQRKKTGRPSFNMLKRARNRVSTVSQLAKRFSKGLLSGQGPMKLVMAFIAFLRFLAIPPTAGILARWGSFKKNGAIKVLRGFKKEISNMLNIMNRRKRSVTMLLMLLPTALAFHLTTRGGEPHMIVSKQERGKSLLFKTSAGVNMCTLIAMDLGELCEDTMTYKCPRITETEPDDVDCWCNATETWVTYGTCSQTGEHRRDKRSVALAPHVGLGLETRTETWMSSEGAWKQIQKVETWALRHPGFTVIALFLAHAIGTSITQKGIIFILLMLVTPSMAMRCVGIGNRDFVEGLSGATWVDVVLEHGSCVTTMAKDKPTLDIELLKTEVTNPAVLRKLCIEAKISNTTTDSRCPTQGEATLVEEQDTNFVCRRTFVDRGWGNGCGLFGKGSLITCAKFKCVTKLEGKIVQYENLKYSVIVTVHTGDQHQVGNETTEHGTTATITPQAPTSEIQLTDYGALTLDCSPRTGLDFNEMVLLTMEKKSWLVHKQWFLDLPLPWTSGASTSQETWNRQDLLVTFKTAHAKKQEVVVLGSQEGAMHTALTGATEIQSSGTTTIFAGHLKCRLKMDKLTLKGMSYVMCTGSFKLEKEVAETQHGTVLVQVKYEGTDAPCKIPFSSQDEKGVTQNGRLITANPIVTDKEKPVNIEAEPPFGESYIVVGAGEKALKLSWFKKGSSIGKMFEATARGARRMAILGDTAWDFGSIGGVFTSVGKLIHQIFGTAYGVLFSGVSWTMKIGIGILLTWLGLNSRSTSLSMTCIAVGMVTLYLGVMVQADSGCVINWKGRELKCGSGIFVTNEVHTWTEQYKFQADSPKRLSAAIGKAWEEGVCGIRSATRLENIMWKQISNELNHILLENDMKFTVVVGDVSGILAQGKKMIRPQPMEHKYSWKSWGKAKIIGADVQNTTFIIDGPNTPECPDNQRAWNIWEVEDYGFGIFTTNIWLKLRDSYTQVCDHRLMSAAIKDSKAVHADMGYWIESEKNETWKLARASFIEVKTCIWPKSHTLWSNGVLESEMIIPKIYGGPISQHNYRPGYFTQTAGPWHLGKLELDFDLCEGTTVVVDEHCGNRGPSLRTTTVTGKTIHEWCCRSCTLPPLRFKGEDGCWYGMEIRPVKEKEENLVKSMVSAGSGEVDSFSLGLLCISIMIEEVMRSRWSRKMLMTGTLAVFLLLTMGQLTWNDLIRLCIMVGANASDKMGMGTTYLALMATFRMRPMFAVGLLFRRLTSREVLLLTVGLSLVASVELPNSLEELGDGLAMGIMMLKLLTDFQSHQLWATLLSLTFVKTTFSLHYAWKTMAMILSIVSLFPLCLSTTSQKTTWLPVLLGSLGCKPLTMFLITENKIWGRKSWPLNEGIMAVGIVSILLSSLLKNDVPLAGPLIAGGMLIACYVISGSSADLSLEKAAEVSWEEEAEHSGASHNILVEVQDDGTMKIKDEERDDTLTILLKATLLAISGVYPMSIPATLFVWYFWQKKKQRSGVLWDTPSPPEVERAVLDDGIYRILQRGLLGRSQVGVGVFQEGVFHTMWHVTRGAVLMYQGKRLEPSWASVKKDLISYGGGWRFQGSWNAGEEVQVIAVEPGKNPKNVQTAPGTFKTPEGEVGAIALDFKPGTSGSPIVNREGKIVGLYGNGVVTTSGTYVSAIAQAKASQEGPLPEIEDEVFRKRNLTIMDLHPGSGKTRRYLPAIVREAIKRKLRTLVLAPTRVVASEMAEALKGMPIRYQTTAVKSEHTGKEIVDLMCHATFTMRLLSPVRVPNYNMIIMDEAHFTDPASIAARGYISTRVGMGEAAAIFMTATPPGSVEAFPQSNAVIQDEERDIPERSWNSGYDWITDFPGKTVWFVPSIKSGNDIANCLRKNGKRVVQLSRKTFDTEYQKTKNNDWDYVVTTDISEMGANFRADRVIDPRRCLKPVILKDGPERVILAGPMPVTVASAAQRRGRIGRNQNKEGDQYIYMGQPLKNDEDHAHWTEAKMLLDNINTPEGIIPALFEP
EREKSAAIDGEYRLRGEARKTFVELMRRGDLPVWLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVEIWTKEGERKKLRPRWLDARTYSDPLALREFKEFAAGRRSVSGDLILEIGKLPQHLTQRAQNALDNLVMLHNSEQGGKAYRHAMEELPDTIETLMLLALIAVLTGGVTLFFLSGRGLGKTSIGLLCVIASSALLWMASVEPHWIAASIILEFFLMVLLIPEPDRQRTPQDNQLAYVVIGLLFMILTVAANEMGLLETTKKDLGIGHAAAENHHHAAMLDVDLHPASAWTLYAVATTIITPMMRHTIENTTANISLTAIANQAAILMGLDKGWPISKMDIGVPLLALGCYSQVNPLTLTAAVFMLVAHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIVAIDLDPVVYDAKFEKQLGQIMLLILCTSQILLMRTTWALCESITLATGPLTTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGLAFSLMKSLGGGRRGTGAQGETLGEKWKRQLNQLSKSEFNTYKRSGIIEVDRSEAKEGLKRGETTKHAVSRGTAKLRWFVERNLVKPEGKVIDLGCGRGGWSYYCAGLKKVTEVKGYTKGGPGHEEPIPMATYGWNLVKLYSGKDVFFTPPEKCDTLLCDIGESSPNPTIEEGRTLRVLKMVEPWLRGNQFCIKILNPYMPSVVETLEQMQRKHGGMLVRNPLSRNSTHEMYWVSCGTGNIVSAVNMTSRMLLNRFTMAHRKPTYERDVDLGAGTRHVAVEPEVANLDIIGQRIENIKNEHKSTWHYDEDNPYKTWAYHGSYEVKPSGSASSMVNGVVRLLTKPWDVIPMVTQIAMTDTTPFGQQRVFKEKVDTRTPKAKRGTAQIMEVTARWLWGFLSRNKKPRICTREEFTRKVRSNAAIGAVFVDENQWNSAKEAVEDERFWDLVHRERELHKQGKCATCVYNMMGKREKKLGEFGKAKGSRAIWYMWLGARFLEFEALGFMNEDHWFSRENSLSGVEGEGLHKLGYILRDISKIPGGNMYADDTAGWDTRITEDDLQNEAKITDIMEPEHALLATSIFKLTYQNKVVRVQRPAKNGTVMDVISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMESEGIFSPSELETPNLAERVLDWLKKHGTERLKRMAISGDDCVVKPIDDRFATALTALNDMGKVRKDIPQWEPSKGWNDWQQVPFCSHHFHQLIMKDGREIVVPCRNQDELVGRARVSQGAGWSLRETACLGKSYAQMWQLMYFHRRDLRLAANAICSAVPVDWVPTSRTTWSIHAHHQWMTTEDMLSVWNRVWIEENPWMEDKTHVSSWEDVPYLGKREDQWCGSLIGLTARATWATNIQVAINQVRRLIGNENYLDFMTSMKRFKNESDPEGALW'

ans = struct with fields:
    A: 130
    R: 362
    N: 195
    D: 141
    C: 112
    Q: 142
    E: 164
    G: 290
    H: 209
    I: 133
    L: 191
    K: 206
    M: 61
    F: 82
    P: 164
    S: 330
    T: 200
    W: 88
    Y: 55
    V: 152

denguepro = struct with fields:
                LocusName: 'AAN06983'
      LocusSequenceLength: '3392'
     LocusNumberofStrands: ''
            LocusTopology: 'linear'
        LocusMoleculeType: ''
     LocusGenBankDivision: 'VRL'
    LocusModificationDate: '30-DEC-2002'
               Definition: 'polyprotein precursor [Dengue virus 1].'
                Accession: 'AAN06983'
                  Version: 'AAN06983.1'
                       GI: ''
                  Project: []
                   DBLink: ' DBSOURCE    accession AY145123.1'
                 Keywords: []
                  Segment: []
                   Source: 'Dengue virus 1'
           SourceOrganism: [3×67 char]
                Reference: {[1×1 struct]  [1×1 struct]}
                  Comment: 'Method: conceptual translation supplied by author.'
                 Features: [129×73 char]
                      CDS: [1×1 struct]
                 Sequence: 'mnnqrkktgrpsfnmlkrarnrvstvsqlakrfskgllsgqgpmklvmafiaflrflaipptagilarwgsfkkngaikvlrgfkkeisnmlnimnrrkrsvtmllmllptalafhlttrggephmivskqergksllfktsagvnmctliamdlgelcedtmtykcpritetepddvdcwcnatetwvtygtcsqtgehrrdkrsvalaphvglgletrtetwmssegawkqiqkvetwalrhpgftvialflahaigtsitqkgiifillmlvtpsmamrcvgignrdfveglsgatwvdvvlehgscvttmakdkptldiellktevtnpavlrklcieakisntttdsrcptqgeatlveeqdtnfvcrrtfvdrgwgngcglfgkgslitcakfkcvtklegkivqyenlkysvivtvhtgdqhqvgnettehgttatitpqaptseiqltdygaltldcsprtgldfnemvlltmekkswlvhkqwfldlplpwtsgastsqetwnrqdllvtfktahakkqevvvlgsqegamhtaltgateiqssgtttifaghlkcrlkmdkltlkgmsyvmctgsfklekevaetqhgtvlvqvkyegtdapckipfssqdekgvtqngrlitanpivtdkekpvnieaeppfgesyivvgagekalklswfkkgssigkmfeatargarrmailgdtawdfgsiggvftsvgklihqifgtaygvlfsgvswtmkigigilltwlglnsrstslsmtciavgmvtlylgvmvqadsgcvinwkgrelkcgsgifvtnevhtwteqykfqadspkrlsaaigkaweegvcgirsatrlenimwkqisnelnhillendmkftvvvgdvsgilaqgkkmirpqpmehkyswkswgkakiigadvqnttfiidgpntpecpdnqrawniwevedygfgifttniwlklrdsytqvcdhrlmsaaikdskavhadmgywieseknetwklarasfievktciwpkshtlwsngvlesemiipkiyggpisqhnyrpgyftqtagpwhlgkleldfdlcegttvvvdehcgnrgpslrtttvtgktihewccrsctlpplrfkgedgcwygmeirpvkekeenlvksmvsagsgevdsfslgllcisimieevmrsrwsrkmlmtgtlavfllltmgqltwndlirlcimvganasdkmgmgttylalmatfrmrpmfavgllfrrltsrevllltvglslvasvelpnsleelgdglamgimmlklltdfqshqlwatllsltfvkttfslhyawktmamilsivslfplclsttsqkttwlpvllgslgckpltmflitenkiwgrkswplnegimavgivsillssllkndvplagpliaggmliacyvisgssadlslekaaevsweeeaehsgashnilvevqddgtmkikdeerddtltillkatllaisgvypmsipatlfvwyfwqkkkqrsgvlwdtpsppeveravlddgiyrilqrgllgrsqvgvgvfqegvfhtmwhvtrgavlmyqgkrlepswasvkkdlisygggwrfqgswnageevqviavepgknpknvqtapgtfktpegevgaialdfkpgtsgspivnregkivglygngvvttsgtyvsaiaqakasqegplpeiedevfrkrnltimdlhpgsgktrrylpaivreaikrklrtlvlaptrvvasemaealkgmpiryqttavksehtgkeivdlmchatftmrllspvrvpnynmiimdeahftdpasiaargyistrvgmgeaaaifmtatppgsveafpqsnaviqdeerdiperswnsgydwitdfpgktvwfvpsiksgndianclrkngkrvvqlsrktfdteyqktknndwdyvvttdisemganfradrvidprrclkpvilkdgpervilagpmpvtvasaaqrrgrigrnqnkegdqyiymgqplkndedhahwteakmlld
nintpegiipalfepereksaaidgeyrlrgearktfvelmrrgdlpvwlsykvasegfqysdrrwcfdgernnqvleenmdveiwtkegerkklrprwldartysdplalrefkefaagrrsvsgdlileigklpqhltqraqnaldnlvmlhnseqggkayrhameelpdtietlmllaliavltggvtlfflsgrglgktsigllcviassallwmasvephwiaasiilefflmvllipepdrqrtpqdnqlayvvigllfmiltvaanemgllettkkdlgighaaaenhhhaamldvdlhpasawtlyavattiitpmmrhtienttanisltaianqaailmgldkgwpiskmdigvpllalgcysqvnpltltaavfmlvahyaiigpglqakatreaqkrtaagimknptvdgivaidldpvvydakfekqlgqimllilctsqillmrttwalcesitlatgplttlwegspgkfwnttiavsmanifrgsylagaglafslmkslgggrrgtgaqgetlgekwkrqlnqlsksefntykrsgiievdrseakeglkrgettkhavsrgtaklrwfvernlvkpegkvidlgcgrggwsyycaglkkvtevkgytkggpgheepipmatygwnlvklysgkdvfftppekcdtllcdigesspnptieegrtlrvlkmvepwlrgnqfcikilnpympsvvetleqmqrkhggmlvrnplsrnsthemywvscgtgnivsavnmtsrmllnrftmahrkptyerdvdlgagtrhvavepevanldiigqrieniknehkstwhydednpyktwayhgsyevkpsgsassmvngvvrlltkpwdvipmvtqiamtdttpfgqqrvfkekvdtrtpkakrgtaqimevtarwlwgflsrnkkprictreeftrkvrsnaaigavfvdenqwnsakeavederfwdlvhrerelhkqgkcatcvynmmgkrekklgefgkakgsraiwymwlgarflefealgfmnedhwfsrenslsgvegeglhklgyilrdiskipggnmyaddtagwdtriteddlqneakitdimepehallatsifkltyqnkvvrvqrpakngtvmdvisrrdqrgsgqvgtyglntftnmeaqlirqmesegifspseletpnlaervldwlkkhgterlkrmaisgddcvvkpiddrfataltalndmgkvrkdipqwepskgwndwqqvpfcshhfhqlimkdgreivvpcrnqdelvgrarvsqgagwslretaclgksyaqmwqlmyfhrrdlrlaanaicsavpvdwvptsrttwsihahhqwmttedmlsvwnrvwieenpwmedkthvsswedvpylgkredqwcgsligltaratwatniqvainqvrrlignenyldfmtsmkrfknesdpegalw'
                SearchURL: 'https://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&id=AAN06983'
              RetrieveURL: 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=22901066&rettype=gp&retmode=text'

dengueproAC = struct with fields:
    C: 16868
    H: 26719
    N: 4645
    O: 4891
    S: 185

ans = 16868

dengueproMW = 3.7878e+05
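
The molecular weight can be cross-checked against the atomic composition in dengueproAC. The sketch below multiplies each atom count by a standard atomic weight and sums the results. The atomic weights are rounded textbook values chosen here as an assumption and may differ slightly from the ones the toolbox uses internally.

```matlab
% Cross-check: molecular weight from the atomic composition shown above.
ac = struct('C', 16868, 'H', 26719, 'N', 4645, 'O', 4891, 'S', 185);
% Rounded standard atomic weights in g/mol (assumed values, not from the toolbox)
w  = struct('C', 12.011, 'H', 1.008, 'N', 14.007, 'O', 15.999, 'S', 32.06);
mw = 0;
elements = fieldnames(ac);
for k = 1:numel(elements)
    mw = mw + ac.(elements{k}) * w.(elements{k});   % count times atomic weight
end
fprintf('%.4e\n', mw);   % prints 3.7878e+05
```

The sum is about 378779 g/mol, which agrees with the reported dengueproMW to the displayed precision.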
