Statistical Analysis of Darwinian Selection in Genome Evolution
Abstract
This work tests the hypothesis that genes located near each other in a genome tend to have similar functions and take part in the same biological processes, and are therefore biased to stay together during evolution because the organism needs them together to survive. From a purely statistical perspective, we would expect organisms with a common ancestor to diverge through random mutations, i.e. insertions, deletions and inversions. Their genes would then be shuffled randomly and independently of each other, so that the synteny blocks they share are progressively split. After a sufficient number of evolutionary steps, we should therefore see no pattern of conserved gene order when comparing organisms with a common ancestor. If statistically significant patterns are nevertheless found, this suggests that Darwinian selection acts on genome evolution. That is, genes with similar functions stay together because they depend on each other, while individuals carrying mutations that separate these genes die because they lose the ability to perform the required processes. The analysis combines sequence analysis, BLAST searches, microarray expression data and Gene Ontology comparison of Escherichia coli and some of its neighbours.
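
The null model described above can be illustrated with a small simulation. The following is only a minimal sketch under simplifying assumptions, not the procedure used in this work: genomes are treated as linear lists of gene identifiers (the E. coli chromosome is circular), only inversions are simulated, and conservation is measured as the number of shared neighbouring gene pairs. All function names are hypothetical and chosen for illustration.

```python
import random


def adjacencies(order):
    """Return the set of unordered neighbouring gene pairs in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def shared_adjacencies(order_a, order_b):
    """Count the neighbouring gene pairs present in both gene orders."""
    return len(adjacencies(order_a) & adjacencies(order_b))


def apply_random_inversions(order, n_inversions, rng):
    """Return a copy of the gene order after n random segment inversions."""
    genome = list(order)
    for _ in range(n_inversions):
        # Pick two distinct cut points and reverse the segment between them.
        i, j = sorted(rng.sample(range(len(genome) + 1), 2))
        genome[i:j] = genome[i:j][::-1]
    return genome


def permutation_p_value(order_a, order_b, n_permutations=1000, seed=0):
    """Estimate how often a fully shuffled genome shares at least as many
    adjacencies with order_a as order_b does (assumed null: random order)."""
    rng = random.Random(seed)
    observed = shared_adjacencies(order_a, order_b)
    shuffled = list(order_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if shared_adjacencies(order_a, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    ancestor = list(range(200))
    # Two descendants that diverge independently from the same ancestor.
    species_a = apply_random_inversions(ancestor, 20, rng)
    species_b = apply_random_inversions(ancestor, 20, rng)
    observed, p_value = permutation_p_value(species_a, species_b)
    print(f"shared adjacencies: {observed}, permutation p-value: {p_value:.4f}")
```

Increasing the number of inversions in this toy model drives the shared-adjacency count towards what a fully shuffled genome would give. Gene neighbourhoods that remain conserved well beyond that expectation in real genomes are the kind of pattern the abstract refers to.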