Neighborhood Mining in Biological Networks

Stenersen, Kristoffer; Sundsdal, Sverre

dc.contributor.advisor	Hetland, Magnus Lie	nb_NO
dc.contributor.author	Stenersen, Kristoffer	nb_NO
dc.contributor.author	Sundsdal, Sverre	nb_NO
dc.date.accessioned	2014-12-19T13:34:23Z
dc.date.available	2014-12-19T13:34:23Z
dc.date.created	2010-09-05	nb_NO
dc.date.issued	2006	nb_NO
dc.identifier	349035	nb_NO
dc.identifier	ntnudaim:1468	nb_NO
dc.identifier.uri	http://hdl.handle.net/11250/251506
dc.description.abstract	Biologists are constantly looking for new knowledge about biological properties and processes. Bio-molecular interaction networks model dependencies among proteins and the processes they participate in. By studying patterns of interaction in these networks, it may be possible to discover implicit information embedded in the network topology. In this thesis we improve existing methods and develop new ones for investigating similarities between proteins and for discovering protein interaction sub-patterns. Cytoscape (Shannon et al., 2003) is a tool for visualization and analysis of interaction networks used by biologists. We have developed an extension to Cytoscape that lets biologists perform the following tasks: (1) compare proteins based on neighborhood information; (2) find an interaction sub-pattern in an interaction network; (3) discover sub-patterns in one or several networks. Our main contributions are improvements to the graph mining algorithms gSpan by Yan and Han (2002) and Apriori by Inokuchi et al. (2003), whose original task was the discovery of frequent sub-patterns in a very large set of networks. We have enabled mining of a single network and allowed less exact matches. The graph mining algorithms run on labeled graphs, and we have used various clustering techniques to assign the labels. The clustering relies on similarity measures between proteins, which we have based on Gene Ontology annotations and experimental data obtained from a ChIP-chip experiment. Our plug-in may easily be extended by adding other clustering techniques or similarity measures. We verify the results of our implementations and test them for speed. We find that, of the two mining algorithms, gSpan shows the most promise for mining biological graphs.	nb_NO
dc.language	eng	nb_NO
dc.publisher	Institutt for datateknikk og informasjonsvitenskap	nb_NO
dc.subject	ntnudaim	no_NO
dc.subject	SIF2 datateknikk	no_NO
dc.subject	Komplekse datasystemer	no_NO
dc.title	Neighborhood Mining in Biological Networks	nb_NO
dc.type	Master thesis	nb_NO
dc.source.pagenumber	107	nb_NO
dc.contributor.department	Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap	nb_NO

Associated file(s)

Filename: 349035_COVER01.pdf
Size: 47.53 KB
Format: PDF

Filename: 349035_FULLTEXT01.pdf
Size: 1.631 MB
Format: PDF

Filename: 349035_ATTACHMENT01.zip
Size: 22.62 MB
Format: Unknown

This item appears in the following collection(s)

Institutt for datateknologi og informatikk [6765]