
dc.contributor.advisor: Halaas, Arne [nb_NO]
dc.contributor.advisor: Sandve, Geir Kjetil [nb_NO]
dc.contributor.advisor: Drabløs, Finn [nb_NO]
dc.contributor.author: Walseng, Vegard [nb_NO]
dc.date.accessioned: 2014-12-19T13:30:57Z
dc.date.available: 2014-12-19T13:30:57Z
dc.date.created: 2010-09-02 [nb_NO]
dc.date.issued: 2006 [nb_NO]
dc.identifier: 346796 [nb_NO]
dc.identifier: ntnudaim:1353 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/250167
dc.description.abstract: The aim of this thesis is twofold. Firstly, it is a survey of some of the most prevalent pattern models used in motif discovery algorithms. The main goal of the survey is to see how well these models, with all their structural differences and varying levels of complexity and flexibility, are actually able to represent binding site motifs. This is done in an attempt to map the advantages and disadvantages of applying a given pattern model to motif discovery tasks, and to see whether any of the models separates itself from the rest (either positively or negatively). To get fair results, the models are placed within a framework that uses an exhaustive search to find best-case patterns from each model. These patterns are then compared to see whether the models differ in their ability to (a) separate motif instances from background (separation) and (b) predict previously unknown motif instances (prediction). However, such exhaustive searching usually takes a very long time, and it becomes necessary to find ways to speed up the process. Thus, the second objective of the thesis is to optimize the search for all three pattern models so that the optimal pattern of each model can be found within a reasonable timeframe. Regarding the first goal, it seems clear that the PWM, if it can be trained correctly, outshines both the mismatch expression and the IUPAC string. Both the separation and the prediction performance of the PWM were quite good, even though the basic PWM algorithm essentially just generates a profile of all the instances in the positive set. Regarding the second goal, the final algorithms for the mismatch expression and the IUPAC string were both able to find optimal patterns very quickly, even for very large values of m. There was little point in finding ways to speed up the PWM algorithm, as it already ran very quickly. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.subject: ntnudaim [no_NO]
dc.subject: SIF2 datateknikk [no_NO]
dc.subject: Komplekse datasystemer [no_NO]
dc.title: Learning pattern models from examples [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 81 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]
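
Illustrative note (not part of the record): the abstract names three pattern models, the IUPAC string, the mismatch expression, and the PWM. The sketch below shows one common way each model can decide whether a candidate site matches. It is not the thesis's implementation. The function names, the pseudocount, the uniform background of 0.25 per base, and the log-odds scoring are all assumptions made for illustration. The PWM here is built as a simple profile of the positive instances, which is how the abstract describes the basic PWM algorithm.

```python
import math

# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_matches(pattern, site):
    """IUPAC string: every site base must be allowed by the pattern symbol."""
    if len(pattern) != len(site):
        return False
    return all(base in IUPAC[sym] for sym, base in zip(pattern, site))


def mismatch_matches(consensus, site, max_mismatches):
    """Mismatch expression: site may differ from consensus in at most k positions."""
    if len(consensus) != len(site):
        return False
    return sum(c != b for c, b in zip(consensus, site)) <= max_mismatches


def build_pwm(instances, pseudocount=1.0):
    """PWM as a profile of equal-length positive instances (assumed log-odds form)."""
    pwm = []
    for i in range(len(instances[0])):
        # The pseudocount keeps unseen bases from getting a score of -infinity.
        counts = {base: pseudocount for base in "ACGT"}
        for inst in instances:
            counts[inst[i]] += 1
        total = sum(counts.values())
        # Log-odds against an assumed uniform background of 0.25 per base.
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in "ACGT"})
    return pwm


def pwm_score(pwm, site):
    """Sum the per-position log-odds scores for a candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))


if __name__ == "__main__":
    positives = ["TATAAA", "TATATA", "TATAAA"]  # hypothetical motif instances
    pwm = build_pwm(positives)
    print(iupac_matches("TATAWA", "TATATA"))
    print(mismatch_matches("TATAAA", "TATACA", 1))
    print(pwm_score(pwm, "TATAAA") > pwm_score(pwm, "GCGCGC"))
```

The first two models give a yes/no answer, while the PWM gives a graded score that needs a threshold to separate motif instances from background.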

