Three investigations of piRNA: Making sense of a new class of molecules
This master's thesis is about Piwi-interacting RNA (piRNA), and its aim is threefold. We investigate i) how well a Support Vector Machine (SVM)-based method compares to the currently used software for piRNA cluster detection, namely proTRAC, ii) whether piRNA uses a seed sequence to bind to protein-coding genes, and iii) whether piRNA regulates any specific types of protein-coding genes.

The results were as follows. For i), the SVM method compared poorly, as it was not able to achieve a high level of specificity and sensitivity at the same time. For ii), by investigating whether potential piRNA seed sequences hit genomic locations depleted for single-nucleotide polymorphisms (SNPs), we found no region of piRNA that seemed more important for binding than the rest. Lastly, for iii), by searching the genes hit most often by piRNA for category enrichment, we found no types of protein-coding genes to be preferentially controlled by piRNA.

Result i) may be important insofar as it shows that a mature machine learning method, the SVM, using a feature vector of attributes known to characterize piRNA, does not measure up to the currently used method for finding piRNA clusters. This indicates that refining proTRAC or similar methods might be a more fruitful avenue for improving piRNA cluster detection. Result ii) is important because it improves our understanding of the still mysterious piRNA; knowing how it might bind (using the whole sequence) may make the functions and regulatory targets of piRNA easier to find. Result iii) likely tells us little beyond that our method of finding classes of piRNA-regulated genes was lacking; one reason might be that different clusters of piRNA regulate different genes.
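The trade-off between sensitivity and specificity noted for result i) can be illustrated with a minimal Python sketch. It is not the thesis's evaluation code: the function name `sensitivity_specificity` and the toy labels below are hypothetical, chosen only to show how a classifier can score highly on one measure while scoring poorly on the other.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity: fraction of real positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of real negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: 1 = region inside a piRNA cluster, 0 = outside.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
lenient = [1, 1, 1, 1, 1, 1, 1, 0]  # calls almost everything a cluster
strict = [1, 0, 0, 0, 0, 0, 0, 0]   # calls almost nothing a cluster

print(sensitivity_specificity(y_true, lenient))  # (1.0, 0.25)
print(sensitivity_specificity(y_true, strict))   # (0.25, 1.0)
```

In this toy setting the lenient predictor finds every positive but rejects few negatives, and the strict predictor does the reverse; a detection method is useful only if it keeps both values high at once.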