
Motif kernel generated by genetic programming improves remote homology and fold detection

Håndstad, Tony; Hestnes, Arne Johan Husebø; Sætrom, Pål
Journal article, Peer reviewed
View/Open
1471-2105-8-23.pdf (1.903Mb)
URI
http://hdl.handle.net/11250/2366770
Date
2007
Collections
  • Institutt for datateknologi og informatikk [3771]
  • Publikasjoner fra CRIStin - NTNU [19694]
Original version
BMC Bioinformatics 2007, 8. doi:10.1186/1471-2105-8-23
Abstract
Background: Protein remote homology detection is a central problem in computational biology. Most recent methods train support vector machines to discriminate between related and unrelated sequences, and these studies have introduced several types of kernels. One successful approach is to base a kernel on shared occurrences of discrete sequence motifs. Still, many protein sequences fail to be classified correctly for lack of a suitable set of motifs for these sequences.

Results: We introduce the GPkernel, which is a motif kernel based on discrete sequence motifs where the motifs are evolved using genetic programming. All proteins can be grouped according to evolutionary relations and structure, and the method uses this inherent structure to create groups of motifs that discriminate between different families of evolutionary origin. When tested on two SCOP benchmarks, the superfamily and fold recognition problems, the GPkernel gives significantly better results than related methods of remote homology detection.

Conclusion: The GPkernel gives particularly good results on the more difficult fold recognition problem compared to the other methods. This is mainly because the method creates motif sets that describe similarities among subgroups of both the related and unrelated proteins. This rich set of motifs gives a better description of the similarities and differences between different folds than do previous motif-based methods.
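
The idea of a kernel based on shared occurrences of discrete sequence motifs can be illustrated with a minimal sketch. It is not the GPkernel itself: the abstract does not specify the motif language or how motifs are evolved, so the sketch assumes motifs are plain regular expressions and uses hypothetical toy motifs and sequences. The names motif_features, motif_kernel, gram_matrix, MOTIFS and SEQS are all illustrative.

```python
import re
from typing import List, Sequence

def motif_features(sequence: str, motifs: Sequence[str]) -> List[int]:
    """Return a 0/1 vector: 1 where the motif (assumed to be a regex) occurs."""
    return [1 if re.search(m, sequence) else 0 for m in motifs]

def motif_kernel(seq_a: str, seq_b: str, motifs: Sequence[str]) -> int:
    """Count the motifs that occur in both sequences (dot product of 0/1 vectors)."""
    fa = motif_features(seq_a, motifs)
    fb = motif_features(seq_b, motifs)
    return sum(x * y for x, y in zip(fa, fb))

def gram_matrix(sequences: Sequence[str], motifs: Sequence[str]) -> List[List[int]]:
    """Pairwise kernel values for a list of sequences."""
    feats = [motif_features(s, motifs) for s in sequences]
    return [[sum(x * y for x, y in zip(fi, fj)) for fj in feats] for fi in feats]

# Hypothetical toy data, not taken from the paper or from SCOP.
MOTIFS = ["G.GK[ST]", "C..C", "HE..H"]
SEQS = ["MAGSGKSTLL", "AAGTGKTCPPC", "QQHEAAHQQ"]

if __name__ == "__main__":
    for row in gram_matrix(SEQS, MOTIFS):
        print(row)
```

A matrix of this kind could in principle be passed to a support vector machine that accepts a precomputed kernel; how the published method weights, groups, or evolves motifs is not covered by this sketch.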
Publisher
BioMed Central
Journal
BMC Bioinformatics

Contact Us | Send Feedback

Privacy policy
DSpace software copyright © 2002-2019  DuraSpace

Service from  Unit
 

 
