Feature selection in Medline using text and data mining techniques
Master thesis
Permanent link
http://hdl.handle.net/11250/250996
Publication date
2005
Abstract
In this thesis we propose a new method for searching for gene products and giving annotations that associate genes with Gene Ontology (GO) codes. Many solutions already exist, using different techniques; however, few are capable of addressing the whole GO hierarchy. We propose a method for exploring this hierarchy by dividing it into subtrees and trying to find terms that are characteristic of the subtrees involved. The method uses feature selection based on chi-square analysis, together with naive Bayes classification, to find the correct GO nodes.
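The combination of chi-square feature selection and naive Bayes classification named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation. The toy documents, the two labels standing in for GO subtrees ("binding" and "transport"), and all function names are hypothetical. The classifier shown is a multinomial naive Bayes with add-one smoothing, which is one common variant and an assumption here.

```python
import math
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(docs, labels, k):
    """Return the k terms with the highest chi-square score over all classes."""
    vocab = {term for doc in docs for term in doc}
    scores = {}
    for term in vocab:
        best = 0.0
        for c in set(labels):
            pairs = list(zip(docs, labels))
            n11 = sum(1 for d, l in pairs if l == c and term in d)
            n10 = sum(1 for d, l in pairs if l != c and term in d)
            n01 = sum(1 for d, l in pairs if l == c and term not in d)
            n00 = sum(1 for d, l in pairs if l != c and term not in d)
            best = max(best, chi_square(n11, n10, n01, n00))
        scores[term] = best
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def train_nb(docs, labels, features):
    """Count class priors and per-class term frequencies over selected features."""
    feats = set(features)
    prior = Counter(labels)
    counts = {c: Counter() for c in prior}
    for doc, label in zip(docs, labels):
        counts[label].update(t for t in doc if t in feats)
    return prior, counts, feats


def predict_nb(model, doc):
    """Return the class with the highest log posterior (add-one smoothing)."""
    prior, counts, feats = model
    total_docs = sum(prior.values())
    best, best_lp = None, -math.inf
    for c in sorted(prior):
        total = sum(counts[c].values())
        lp = math.log(prior[c] / total_docs)
        for t in doc:
            if t in feats:
                lp += math.log((counts[c][t] + 1) / (total + len(feats)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best


# Hypothetical toy corpus: each document is a list of tokens,
# each label stands in for one GO subtree.
docs = [
    ["protein", "binds", "dna"],
    ["dna", "binding", "protein"],
    ["ion", "transport", "membrane", "protein"],
    ["membrane", "transport", "channel"],
]
labels = ["binding", "binding", "transport", "transport"]

features = select_features(docs, labels, 3)
model = train_nb(docs, labels, features)
```

With this setup, `predict_nb(model, ["membrane", "transport", "protein"])` assigns the document to the hypothetical "transport" subtree. In a hierarchical setting, one such classifier per subtree split would be one way to descend toward a specific GO node, though the abstract does not specify how the thesis organizes this.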