DanielT29
Guys, I need major help with this assignment, which has to be written in C. I'm a beginner at this, so please explain what I should do in detail. Thanks!
Caveat
The solution to this assignment should be a C program necessarily structured as a main function.
The problem
Suppose a DNA strand consists of a sequence of genes. Each gene is a sequence of four types
of nucleotides: Adenine (A), Thymine (T), Cytosine (C), Guanine (G). Moreover, a gene has
two well defined, adjacent regions: a coding region called exon, located in the beginning of the
gene, and a non-coding region called intron, located in the immediate tail of the gene. If a gene
contains h nucleotides in the exon region, then the length t of the intron region (i.e., the number
of nucleotides in it) can be determined as follows:
t = 3h + 1
For instance, Figure 1 shows a DNA strand of a fictitious organism. It contains 2 genes and each
gene has 3 nucleotides in the exon region (shown underlined in the figure).
First Gene:  GATAD{GAATGCC
Second Gene: CCTCGTAGTTGAC
Figure 1: a DNA strand with 2 genes
In view of the above, your C main function should:
1. scan a data file containing samples of DNA strands of (fictitious) organisms: in this data
file, for each sample there is a line of data with the following fields separated by blanks -
sample ID (integer), number of genes in the DNA strand (integer), length of the exon portion
of each gene (integer), a sequence of characters representing the DNA strand. Note: you
should NOT open a file to read data from it using C commands fopen and fscanf. Instead,
you should set up your C development environment (Quincy) to take a data file as the source of input, and use the C function scanf to read the data. If you do not know how to do that, you should consult the Discussion Board, your TA, or your Instructor.
2. determine the type of the DNA strand: the DNA type depends on the mean number and
type of nucleotides in the exon regions of the genes present in the DNA strand, as shown in
Table 1:
DNA TYPE | CRITERIA
1 | C<A<T<G
2 | A<C<,G>0
3 | otherwise
*The underlines on the letters mean the mean (average), while an underline on the < means "or equal to" (in this case, greater than or equal to). I had to retype the table since it couldn't be copied properly from the page I was viewing it in.
Table 1: DNA types and criteria. In the Criteria column, A, T, C, and G denote the mean number of the respective nucleotides present in the exon regions of a DNA strand (i.e., A denotes the mean number of Adenine in the exon regions of the strand).
3. produce separate printouts for all the samples contained in each data file: each printout
will present for each strand, on separate lines, the ID number of the strand, the values of
A, T, G, and C, and the type of the strand. It should also include the name of the data file.
The data should be labeled and formatted exactly as shown in Figure 2.
Data File: dataFile01.txt
ID mA mT mG mC Type
684 0.27 0.30 0.20 0.24 3
465 0.26 0.25 0.24 0.24 3
131 0.21 0.21 0.30 0.27 3
* I typed up only 3 lines to save time; there were about 7. This is not the data being read; it is an example of how my program should print out the data.
* The 3-digit integers are the sample IDs, the decimal numbers are the mean values of each nucleotide, and the single-digit integers (all 3s) are the type numbers. Sorry guys, the columns are not aligned properly.
* What I'm having trouble with: how do I know which parts of the DNA sequence are A, T, G and C? If I don't know this, I can't compute their mean values to assign a type. Also, how do I write a program that knows when it's done reading one line of a DNA sequence and then moves on to the next as a brand new one? How do I print out my values like in Figure 2, and how do I use printf and scanf to read and write my data (is this redirection)? Thanks guys. If I have some idea of this I can start on it; right now I have a big programmer's block preventing me from even starting, and I'm stuck on int main(void). An algorithm would be nice (not the C code itself). I will be updating you guys on the code I write, and I hope to have it done by Monday night (June 1st). Thanks a lot guys, I appreciate your help.
*EDIT May 31 2009 11:56pm (EST)
I found a file that came with the assignment that I had overlooked; maybe you guys can help me better by looking at it. My program basically has to read this file in.
684 30 8 ATCAGAGGGGATTTCGCCGGTCTATCGAAGCTGAATTCATGTATTAACGATTACAACGAATCAGTAAAGAGTCCTTAGACGGTGATACAGCACGGTGGGTCGTGCTTTAGCCTTTTGCTTTCACTGTTCTAGTGATAATGAGGCTCGAAACTCCTGACCATATAAATGAGTACATAGGGACCCAAGGATAGCTATTCTTATTTACATGTATACGCACTCTCCACCTGCAAAGTCCTTTAGCAGATCCCCCACATGTCTCTATTAACCTAGTATATCCGCTTTTCATGGCTGATCCAACGTAAGGATCTCAGTCGCTCTGGGGTAGAAGTCGCCAATGGGCGTAAACGTAATTTGTTCCGGATTCATATTAACGTAATATAGCAACCTCCGAAACACAATGCGTGAGATTACCTATGTGCTTAACTCTATTTACATCGTGAGAGTTCCGGCAGTTAAGACAGCCCTCTAGTGGAAGGGGCTCCACCACAAATTTGTCTCCGCTTGAGAGAAATTGGATCGACCGTCCGTGAGGACCCGCCGCTGTTCACAGCCAAAGTAAAATGGTATAAAACCGGCGGTATCACTCAAACTTGCCCCATCATCTAAATGAGGCGATAGAATAGGCCTCACTCCTTTTTCGGGCACCCATGAACCCCCACGCCGTACTTACTCGCGAAGTCCCAGTAATTAGACAGCCGTGGGAGATCGTGAGGTCTAAGCGCCCGCACTCAACCATGGGACTGCGAAAGATAGAATAATCTGACATCCGAGAAGTTCCTCGATCCGAAGACGAGAAAGTTCTCAAGCGGCTACCGAACCTTCCTCTGCTGGACAGGTGCTGCGGTCCCGAAGTTAGCCCGTCTCATGAAAAACCAAACGCCTTGCTTTCAGTTATTAACGTCATCTGACAACCCGAAGCATTAGGTGAGAACGCCCCGGCGCCTTGCGCCGGTCCTGATGTTCTGCTCAATCCCCGTAACCTGCGAGGCC
465 87 7 GTGTAGTTGCTCCGACAGACCCGAAACCCCCAACTTTCTGGAACTTTTCTGTATAGCAGGGCCTAGATATTCCGAAATCTATGCTGCCCTCTCTCCATGATTCGCGTTAGCACTTTACATCTACCCCAATAGAATGACATTGGGCTCTACCCTCACGGGTCACATAGCGGGTAAACATCGAAGCGATAAGGCTAAGGCGTCACCCAGTTCAGCTCAACAACTTAACTTCAAGTCTGCCCATAAATTGGTGGGCCATGACCAGTATACTCGCAAAGGCATTACGCCTCCAGACGCTGACGGTTTCAGGCGCAAACCCATCAGGATCACAGCCCCGCAAGATGAGCGATGCCCTAAACGCCAAAGGTGACTGGGCTTTTCGTGACTCCCGTAATTCCCCTGCAAGGAGCTGTGGACAGACGTCACATGGAGACAGTACATCTTGGTGAGCCCTTGGCTCGCGTGACCGTATGGTCTAGAGAATTCACGCGTGTGCCAGACTCGCACGGTCAGTAGTCAGGTCCGTCAAATATCTGCGACCGAGTTTGGAGCAGAAGTTGGGGCACGCAAAGTGGGCCGGCATGTGAGTAATGAGGAAAGGACACTGATGTTGAGGGTGCAGCTATTTAAGAGATCTCCTTCGGAGTTATTGACCCGCTCTAGAGTACGGCGGGAATACCTCGAAGTCCCTATTGAGCCTTACATTAACGGATCTGTGCCAAAAGTTCGGACATGGGACATCCGGCGCTGGGCGCCCTCGTGTAATGCGCGTTGTAGGAATCCAGTGAAAGTGTATTCCATAGAATGTCGATAACAGTAAACCCCGCCGTCTACTCGTGAAACAGGACTTTATGCAATTTCTGCCATGGATGAGCGCTGGGTTAAGATAGCTATCTCGATAATCGAACGTTAATCCCCTGGTCAGACGGGTAACATCAATTTCTTCCCAAATACGCTACATTCTTCTACGTGTCGGCTTGGAAGGAGTCCTCTCGTTCAAAATGATATATCCAATGAGTTAACATCCATTTCCGCGGGCTGCGAAACCACCCGCTTTAATAAACTTGGTCTAGATATTCACCAAGCTTCTTACCAACACGCAAGTACTGCAATTTCGCGAACCCTGAATCTGATAGGTGGAACATATGGAGCCCTGACAATGCGATGATGGGGGGTCGACTAAGGGGCATATGTGCTTTCCAAGCACTGGGTAGTGCAACAAAGAATGCTAATACTCTGAGTGCGGGGTCGCGGACATCCTCCTGACAGGTACTCCGAGCGCCCGGTATTACTTGAAGACACACTAATCCGAAGAACCTGGCCATCTAATAATTGGCCGCTGTTGGCGGACCTTAGGCACACAGTTTCTCTGCTTCCCGAACGTACGAGAGCTTGCCTGAGACGCTAGTACCAAGGTGGAATATCACACTCTTGGAATGGGAAACTTGGCCTTTTAGCCCTCTCTGATGTCACAGAAGCCGATGAACGGCTACAGCAGATGTGTGATCATGTACCTTCAGCCTAGGATCGTCTAGGGCCTGAATCTAGCAATAGAACTTAGATAGGTGATGACTAAATCCGTACTTAGGTTCTAGGAAACCGTGTGACACTATGGGCCCACAGACAAGGGCACGATCAATAGAACCGGGATTTATCTACTTTGAGTTGCTCAACATCTACCAGATGTTAATCGGTTGTGGGATTCCATATCAGTTGGACTATTAGATGTCATGCAAAGAAATGGCGCCCGCGATAACCAGTTCCTAACACTGTTGACAGAGAACAAACTCTCCTTCGGGTTGCTATATTTCTAGAAAACAAGATTGTGCGAAGAACATGTGTGTATGTTGTGATATCCTGTCTGTAAGCAGACCTTAAATCATGCGCTTGCAGGTGCTCTACATCTTACGATGCGTTATGGACTTTCATTCGTTTAATTGTGCGCTGCCCGCTTTACTGATGGGGATGAAATTTAGTGCTGGCTTTAACACCCG
AGGCAACTACGTATAGAGTAACATTTTACGAACGATAGGGTAACAACGCGCTGGAACGTTGGATCAATGAGCGCTGATCCGGGGCTAGTACTGGCGTATGAGACTTTACTCGAGGGCACGCGACACCGCTGCATCTACATGGGTCACTGATACATGTGATTACTTTAAAGTAGTGTAAACCGCTGGCATTCCTTCAGACTGGCCGGAATTCGACCCTCGTGGAGATCTGTCCTACAGGTCTCCAAAAATGGGGGTATTCTACCGATCAGAGGCCGCAAGCTATTCATGTATGGCGGCATTGGATATCCTAAATTCGTATCCAGCCGCTAACGAATGAGTCTTTCGCCGTTTTCCGTCTCAGATAATTGTCTTCCTGTAGTTAAATAAGCAATCCTTCTTACTACGGCCGCTCTAGCGATCGATGGGAGCCGGCCCCCCCGCGTGTTCATCACTCAACGTCAGCAGCGTAAGTTAGTAATGTTAGATGAGAGCTCTTCGTGTGATAACTATTATTATTTGCAAGTGCCAGGTC
131 8 7 GAGCAACCCAGCCTCAGAGAGGCGGCCACGGGAACGACGCCTAAAACTATCTGCGCTCCCGTCGGTCACACATCGAGTTAATAATAAGCTGTTACACTAGGATAGCGCAGTGCGCTTGTTTGGTGGTTTCAAAGAGGTTCCAGTCAGCCCACAGTCTGTCGCCGTCAAATTCTTTCCCACTATTTAAGGAGCTCATCGGGCACATGATGATGTAGCACGGAAGTTCTGTAGT
295 44 7 TTTGCGGTGTAAGCATGACTGACAGATGAGTTGGTACCGTATACCAAGTACTAGCGACTAAGGTCTCCACGGAACACTGCAAAGACCATAGCGCACGTAAGTGACATGTCCTGGGTGTCGAAGTACGCTACAATCCACGATGTAGTGTGATCCCCATCGATGCATTGGTCTAGCATGGTCATATAAATAGCAGCCCGAACCGGGGTTCCCATGTGACCGCGACTTCGTCTATGTAATGATGACTTGCTTCGAATAGATACGAAGCTGGACCGAGATCAGGTCGTCTGCTGGCAGGGAACCGGGAGCGTCGAGAGGGGATACCGCGCTTTATATAAAAAGCAGCGAGCTAAAGCGGGACGTTTGGGCATGTATGGGGCTGTTGTCGCTGTCTTGTCAAGTTTAAAATGTCGCGGGCTAGCCTGAGTTAAATTTCGACATAGCTGCATGGATTATTTATACTGTCCAATGGCCAAGAGACCATAAACAGTGGGGGCACCCTTCCGATGCCCTGAACGCTGGAAGATTAAACCGCAGTGTTAGAGGCACTGAGAGTTCTAGCTGATGCAGTGCGCACGGCGGCACGCCCCGACTTTGGCTATTGGAGACCCGAACCGGACAGCTCTTATCGCATTTAGCCAGCGTATGAGTGACCCAGAGGCATGATGTTGTTGCTAAAGACCAGTCACCTCCACTAGCCTGCACCGAGCTTATTTCGTCAGGCCCGGTGTGGAGAATATGAGCACCCCCCATCGCAAACTTCTGAGACTCGACCTACGGGCCCATCATTCAAAATCCTGGAGGAAGATCGAGGAGGGATCAACACGGTCGATAGCCGATCCATTAAAAGGATGAATATCAGCGTCACAGCCCTGATCCTGGATAGACACCACTTCGAATTATGTAAGTCGGGAGCGGCCGGAGCTCAGGCCGCCTTGCAGCTGCTCGAGAGCAAAGGCTAAGAGCCACTGTGCGAGCAGTCATTCCTGTAGTCTTCACCACCTTTAGGACCACGTGTTCGACGCTACAGGACTCGTATAACAGGGATCGCCGATACGGATTTGTTCCTGACTCTCAGTTAAACGATATCTCGGGGATATACAGTTAGGTACATCCTGCGTACGCTCTTCGCTTATTGCGGTTTTGACTGAGCACCGTAAGGAGTAAGGCAAAGGTTGCAACCGGCGAGTTAGGGTGGAGTCTTGATATTGTGCGCGGCGTATCACAACCTTAACGTCACGCACGGCGCCTTCGGTCACTCAATCTGTATTTACAGCCC
216 29 9 ACTAGTATTTAGCCGAGGTGATCATAGGTGATGCACCAGTGTATACATTGGTCGGTTGCTGGCCTCAAAACCAGCCTGCTGCCTGGACCCAAGTTGGTCTAGAAATTTTCGTAGCCGTACAAAGAAAACTATTAGTGTATGTGTCAACTCTAGGTAGGGGCCCCGGTTGCCCTGATTATAGTAATGTCCTTCATCAGAGTCATAACATAGTTCCCGAGGGTTCTGAGTAGACAGACGATGTATTACGTAGCTAAATTTAATTGCACGACGTCTATCATTCCCCCTTCCGTCAGCTTCCGTGAACTCCGGGTACCGAGAATTCGCTGACAAACTTAACGGCCAGGAGAAAGGCCCGTAGCCGGTTGTGGCGCCACAAACAAACCGCCCCCGCTGATTGTCATTAAGTGGCCGAGTTTAAACAGCTCGCCGTCGAATGGTATGCTGGCGCGAAATTTTGGTATTCATGACTCCTGAGTGCTAAGTAACCGGAATTCTTTGAGAAGTGCCGACATCATAAACCGGGATTTACAAGAACTGGACACAGTCCGCAGTAACGGTTACATGTCCTACACCTGAACGGAGGTGATGTCAACTACGCTTCACGATCCCAGAGTCTGAGAGCCCTAGAAGTAATACAGCACTCCGTGTTGATCAGACCTTCGAACAGGTTAATTTAGAGGTAGATCAGTACCCGAATAGGGCAAAACGACTCTAAGAGACTAGCATGACGGAATACTTGCTGAGCCACAATTTGTGGTCGAGGAATCACGTTTGGACCGATGCCTTCGCTCCCAGCGAGTCTAGAGGCTTCGCAGAAGACTTCTCAGCTCGGGGCGACGTAGAGCTCATTGGAAGTTCTGCCTGTGGCGCTTTCGCGTTTAGTACCAGCCGCGTAGGGTGGTCCCTAAGAATACATCCGTGGGTCGACCATGAATTGGGGTCGAATGCTGATACAGCGCACGAGAAAGTTGTTGTCGTCTGATACTTGTCATTCTTGTTTCAGTCTCCTGATCTAAGCAGACCTTCCGCCATTTCGAATCTATTCAGTTAATTAGTTCTCCTATAATGGGTAC
766 73 6 CGCATGTTTGTCGATTGTTAAAGCGGATGTATGTTGTTAGTTTCCTTAAAGTGGCTCCCGGAAGTACACCGACTGATCTCGATCAATCATAAACTGTAGAACAGCGCAGGCCCCTTCAGCATGTTTGCAGCGGGACGAGGCATCTGTTTCACTACGCCAACTTCCGTCGGTTTTATTCTTCTCACGCTGGCGTCCATCCTGACGCAGTTCTGATAAGAGCTATGCGTTATTACATACCAGCCGGCGAGAGTTGGGCAGTAGAGAGAACCTTCGGAAGCTCCTTCCTGACCGGGGCAATTCTTTGCGCATAGTCATTCCGTTATCAACTTCGCTAACCCAATCAGCTTGCGGGCAGAACGACCGGCAAGCCTTCGCGTTGGTAAGGGTTCTAATGATGTATAATTAAGTCCCAGCCTGTTGGTTACTCAGAATTGTAAACATGTGTCGCGTAGTCAGCTCATGGCTGCACATACGGTTCCGTTCATTTCGGGCTGAAAGACGGGGCCTCTTCTAAGCTTATGCACTTTGAGCTACGACTGTACCGAACGGAATTACCAGTATTCCGGACCCATGCGTAATCTCCACCGGATAATGATCTTGCATGACCGCCTGTGGATTAGGAAACGGCTAAAACAATGCTGTGAGTCGTCCACCCAGTCCGCATAAGGCATCCAGAATTAAGGGCTGTATTGTTGGCATTTACGCAATTCACCTGATACTACGAGTGGGAGACCGGGGCGTACGTCTCCAGGATTATTCTAACTACGTGCATTAAGATAAGTGCTCGCAGATGACCTCGTAGGTGTGGTTTCCGTTGTAAACCGAATGGGATCCCATAATGCGCAATTCGGTTACCAACATGACGGGGATACCCTATGCGAATCAGCCAAGTCGGATATCGCCGCGGCATAACTCCATCGTCGGGATATGTCTCATCCGAACAAGTCAAATCTCTCCGCGCCCTGTATCAATCCGTTCCTAACGAGTCGTTTTTACTCAGCCAATCTTCAAATGACAGGACTCATGTAATTAGCCAACCTACGGGGGGTTTCATATTTCGCTAATTTTGCCAGGGGCAGGAGGATAACTTACAGAACTATCGGTGTAACCAATTACAATTACCTGCGCCCTAAACTGCTGCGACGGACTGTATCTTCGGGGAATTGCTTATGAGAACTCTGTATCGACAGTATTTCAAGCACTAGCTTGCCCCGATACCAGGTGAATAGAACAGAGGTCAATACATTCCTTCAACCTAGTACGCTCAAATTAAATTTCGGAACATCCCTGTGGTATGCTTCGTTTCACCAATGAAAAGGTACAGAATGGTAAACATCGCTGCAGAAACTACCAATGAAACTTTTTATTCATAAAGCGGTGACTGGCCGTGTGAGCGACTGCGGCGCACCGATAGACCAAGCACGATAATACGACAAATTCCAGGCACCGAACACTAACCACCGATAGGATTGCTCCGCGCGCACTTTTGGAAGTGTCTATACATTCTTGATGAATCCGACACTAGCCGCGCTTACTGGAATTTGATCTTGCTCCATCGGGGCCTGTGGTAACTGGTGGATTCTCAGGTACCCCTAGTCAGGCTCCTGAATTTATGAAATGTGTCTGTCGATTTTCGGTGGTTCTAACAGCGAGCAAGACTTGGCCATTGGGCTGGCGGCAATAAAAAAGGGACGTGAGTTACCAGCGGGCTGCGGGCCCTACGCAAAGGCCAGGGTTAGCGAGACTCGTGTTACAAAATGGGGACTGTTCTGTTGGATAGTGCCAGACATAGGCGATCGAGTTAATCATACTTACAGGACCAAAA
594 94 3 GACGCGACAACTCAGGATCACGACATCCCAAATAATGTCACCTCAAAAAATTTTGACCTGTTGGACGCACATTATAGTGTTTCGCTATCGTCCCGTTTGGCCCTGAAAATCTACACTATATGAATCCCGGAAGCCCGAACGATAGGTCAA