Signals in nucleotide sequences





  1. Task: use the MEME server to find a motif shared by the sequences; this motif is the putative signal recognized by the PurR protein.

    The following parameters were set for MEME:


    The "name of file" field was given the file with the sequences. LOGO:



    PSSM matrix:

    Pos      A      C      G      T  Pattern
      1    149  -1023     10  -1023  R
      2    -68    110    -48    -68  C
      3  -1023   -148    198  -1023  G
      4  -1023    198   -148  -1023  C
      5    190  -1023  -1023  -1023  A
      6    190  -1023  -1023  -1023  A
      7     90   -148   -148     32  W
      8  -1023    210  -1023  -1023  C
      9  -1023  -1023    210  -1023  G
     10   -168  -1023    -48    149  T
     11  -1023  -1023  -1023    190  T
     12  -1023  -1023  -1023    190  T
     13    -68  -1023   -148    149  T
     14   -168    198  -1023  -1023  C
     15   -168    110     52   -168  S
     16    -10    -48  -1023    113  T
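
    To illustrate how such a matrix is used, the sketch below scores a 16-nucleotide window by summing one matrix entry per position and scans a sequence on both strands for the best window. It uses the table values as-is. The function names (score, consensus, reverse_complement, best_hit) are hypothetical helpers introduced for this illustration; this is not MEME's own search procedure, and it assumes the input contains only a, c, g, t and is at least 16 nucleotides long.

```python
# PSSM rows copied from the table above; columns are A, C, G, T.
PSSM = [
    (149, -1023, 10, -1023),
    (-68, 110, -48, -68),
    (-1023, -148, 198, -1023),
    (-1023, 198, -148, -1023),
    (190, -1023, -1023, -1023),
    (190, -1023, -1023, -1023),
    (90, -148, -148, 32),
    (-1023, 210, -1023, -1023),
    (-1023, -1023, 210, -1023),
    (-168, -1023, -48, 149),
    (-1023, -1023, -1023, 190),
    (-1023, -1023, -1023, 190),
    (-68, -1023, -148, 149),
    (-168, 198, -1023, -1023),
    (-168, 110, 52, -168),
    (-10, -48, -1023, 113),
]
COLUMN = {"a": 0, "c": 1, "g": 2, "t": 3}
COMPLEMENT = str.maketrans("acgt", "tgca")


def score(site):
    """Sum the matrix entries selected by each nucleotide of a 16-mer."""
    return sum(row[COLUMN[base]] for row, base in zip(PSSM, site.lower()))


def consensus():
    """Highest-scoring nucleotide at every position of the matrix."""
    return "".join("ACGT"[row.index(max(row))] for row in PSSM)


def reverse_complement(seq):
    """Reverse complement of a lowercase/uppercase acgt string."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def best_hit(seq):
    """Return (score, strand, offset) of the best window on either strand.

    The offset is 0-based; for the "-" strand it is counted along the
    reverse-complemented sequence, not along the original one.
    """
    width = len(PSSM)
    hits = []
    for strand, s in (("+", seq.lower()), ("-", reverse_complement(seq))):
        for offset in range(len(s) - width + 1):
            hits.append((score(s[offset:offset + width]), strand, offset))
    return max(hits)


print(consensus())
print(score("acgcaaccgttttcct"))  # fragment taken from the purE sequence
```

    With this matrix, the purE fragment acgcaaccgttttcct scores 2406, against 2644 for the consensus string ACGCAAACGTTTTCCT.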


    Table of motifs found:

    Sequence name   Strand (+/-)   Coordinate of the first nucleotide   P-value
    purE + 163 6.28e-09
    purL + 158 7.77e-09
    cvpA + 178 9.61e-09
    codB - 167 1.13e-08
    purR + 189 2.47e-08
    pyrC - 183 5.59e-08
    carA + 13 6.06e-08
    purA - 128 9.50e-08
    purM - 170 1.93e-07
    folD - 271 2.56e-07
    guaB + 181 5.69e-07
    glnB + 167 6.65e-07


    Source sequences (the motifs found are underlined):

    >codB
    tacggacctgaaccgtaggtcggataaggcgctcgcgtcgcatccgacaccatgctcagatgcctgatgcgacgctgacgcgtcttatca
    ggcctacccactgtttttacaccgataatttttcccccacctttttgcactcattcatataaaaaatatatttccccacgaaaacgattg
    ctttttatcttcagatgaatagaatgcggcggattttttgggtttcaaacagcaaaaagggggaatttcgtgtcgcaagataacaacttt
    agccaggggccagtcccgcagtcggcgcgg
    
    >purE
    tcgcccggcggtgcatgaacttatcgccaatcagcaacctgcttttcgcgtggtactgggtgcctggcatacggaaggttcaatggtgaaa
    gtcacggcggatgacgttgagctgattcattttccgttttaaaaaacccgcaactttgctgatttcacagccacgcaaccgttttcct
    tgctctctttccgtgctattctctgtgccctctaaagccgagagttgtgcaccacaggagttttaagacgcatgtcttcccgcaata
    atccggcgcgtgtcgccatcgtgatggggtccaa
    
    >pyrC
    gaaccaggcattacgcaattactttaaccagcaacctgcttacgtcctgcgcgaagatggcagccagggcgaagcaatggcgaaaaaactg
    gcgaaaggcattgaagtgaagccaggcgaaattgtcattccatttactgattaatcacgagggcgcattcgcgccctttatttttcgtgca
    aaggaaaacgtttccgcttatcctttgtgtccggcaaaaacatcccttcagccggagcatagagattaatgactgcaccatcccaggtatt
    aaagatccgccgcccagacgactggca
    
    >purR
    ttacacactgtgatgaaaaaatctcccgtcatttataatgataagtgtttttaccacttccccttttcgtcaagatcggccaaaattccacg
    cttacactatttgcgtactggccattgaccccttcctgacgctccgtgtcgtttttccggcgtaccgcaacacttttgttgtgcgtaaggtg
    tgtaaaggcaaacgtttaccttgcgattttgcaggagctgaagttagggtctggagtgaaatggaatggcaacaataaaagatgtagcgaaa
    cgagcaaacgtttccactacaact
    
    >cvpA
    tgcctgatgcgacgctggcgcgtcttatcaggcctacgcaggggtagaaccgtaggtcggataaggcgtttacgccgcatccgacacgcatt
    gcccgatgccgcaaaggcataaaaagtcgatggcgttgaatattttttcagcgccatttttattgatgcgcgggaaggaaatccctacgcaa
    acgttttctttttctgttagaatgcgccccgaacaggatgacagggcgtaaaatcgtgggacacatatggtctggattgattacgccataat
    cgcggtgattgctttttcctctct
    
    >purM
    ttttcgttgactttagtcaaaatgataacggtttgagataaagttattttatattcagatggttatgaaagaagattattccatccgaaaac
    taacctttaccctggcacaagtcttctttcgccgcgcgcctggggaaaagacgtgcaaaaaggttgtgtaaagcagtctcgcaaacgtttgc
    tttccctgttagaattgcgccgaattttatttttctaccgcaagtaacgcgtggggacccaagcagtgaccgataaaacctctcttagctac
    aaagatgccggtgttgatattgac
    
    >guaB
    acctgtcccatctcatgctcaagcagcagacgaaccgtttgattcaggcgactaacggtaaaaattgcaggggattgagaaggtaacatgtg
    agcgagatcaaattctaaatcagcaggttattcagtcgatagtaacccgcccttcggggatagcaagcattttttgcaaaaaggggtagatg
    caatcggttacgctctgtataatgccgcggcaatatttattaaccactctggtcgagatattgcccatgctacgtatcgctaaagaagctct
    gacgtttgacgacgttctcctcgt
    
    >glnB
    gggtgaaaatacggcgctgccaacctttgttgaggcacgtaatcagtttgaactcaactatttgcgtaagctgctgcaaatcaccaaaggca
    acgtcacccacgcggcgagaatggcggggcgcaaccggacagaattttataaactgctttcccgacacgagctggatgcaaacgatttcaag
    gaatgaattggcgttatgtgttacgtttagcagatcaaaagacaggcgaccttttcaaggaatagcatgaaaaagattgatgcgattataaa
    acccttcaagctggacgatgtccg
    
    >purL
    attctctgtgtcgtgcgcgtcccagcttgaaaaaacgtaataatagtgaaaggtttactcataaatgagcggcattttgcgtaaacctgcgc
    cagatggcaacttattacagccattggcggcacgcgttgctaattcacgatggtgattttatttccacgcaaacggtttcgtcagcgcatca
    gattctttataatgacgcccgtttcccccccttgggtacaccgaaagcttagaagacgagagacttatgatggaaattctgcgtggttcgcc
    tgcactgtcggcattccgaatcaa
    
    >purA
    tagggccgatgctttacccgaaggcatggaagaagatgatctctgcgatgaccaatttgcccgataatattttacgtcgttttggcggtgga
    cttgtggttgcgggcgttgtggtctactacatgttgaggaaaacgattggctgaacaaaaaacagactgatcgaggtcatttttgagtgcaa
    aaagtgctgtaactctgaaaaagcgatggtagaatccatttttaagcaaacggtgattttgaaaaatgggtaacaacgtcgtcgtactgggc
    acccaatggggtgacgaaggtaaa
    
    >folD
    aaattctttttatattgtcaggtatttcttaaattatcttaatccttagacaaggaaataaatcagttccagatttacaacgccatcatgga
    cgaaaaatgaagctttcagtctcagcgacggtgcgcctcaccttcgcaagaggtcgcttcacgcgataaatctgaaacgaaacctgacagcg
    cgccccgcttctgacaaaataggcgcatccccttcgatctacgtaacagatggaatcctctctctgatggcagcaaagattattgacggtaa
    aacgattgcgcagcaggtgcgctc
    
    >rpiA
    ttgaatggcgtggcgttattgcctcaatttgcctgtaaacaggggcttgcgaacggtgaactggtgcgcctgtttgcaccgtggagcggcatac
    ccagaccgttgtatgctttatttgcggggcgaaaggggatgcctgccattgcgcgatattttatggatgagttaaccacgcggcttgccaacgg
    ggtctgaatcgctttttttgtatataatgcgtgtgaaatttcataccacaggcgaaacgatcatgacgcaggatgaattgaaaaaagcagtagg
    atgggcggcacttcagta
    
    
    >carA
    caatcttcttgctgcgcaagcgttttccagaacaggttagatgatctttttgtcgcttaatgcctgtaaaacatgcatgagccacaaaataa
    tataaaaaatcccgccattaagttgacttttagcgcccatatctccagaatgccgccgtttgccagaaattcgtcggtaagcagatttgcat
    tgatttacgtcatcattgtgaattaatatgcaaataaagtgagtgaatattctctggagggtgttttgattaagtcagcgctattggttctg
    gaagacggaacccagtttcacggt
    


  2. Comparison of the results with the real PurR recognition sites.

    MEME chose the wrong recognition site in only two sequences: purR and purA.

    For purR:

    >purR
    ggcgtaccgcaacacttttgttgtgcgtaaggtgtgtaaaggcaaacgtttaccttgcgattttgcaggagctgaagttagggtc
    


    For purA:
    >purA
    aggtcatttttgagtgcaaaaagtgctgtaactctgaaaaagcgatggtagaatccatttttaagcaaacggtgattttgaaaaa
    
    


    Total predicted: 12
    Correctly predicted: 8

    Sensitivity = correctly predicted / total real sites = 8/10 = 0.8
    Specificity (here, the fraction of predictions that are correct, i.e. precision) = correctly predicted / total predicted = 8/12 = 0.667
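
    The same arithmetic can be written as a short sketch. The function name prediction_quality and its parameter names are hypothetical, chosen for this illustration; the counts (12 predicted, 8 correct, 10 real sites) are the ones given above, and the second ratio follows the report's definition of specificity (correct / predicted).

```python
def prediction_quality(predicted, correct, real_sites):
    """Return (sensitivity, specificity) using the definitions of this report."""
    sensitivity = correct / real_sites   # share of real sites that were found
    specificity = correct / predicted    # share of predictions that are correct
    return sensitivity, specificity


sensitivity, specificity = prediction_quality(predicted=12, correct=8, real_sites=10)
print(f"Sensitivity = {sensitivity:.3f}")
print(f"Specificity = {specificity:.3f}")
```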