The Triform algorithm: improved sensitivity and specificity in ChIP-Seq peak finding
Journal article, Peer reviewed
Permanent link
http://hdl.handle.net/11250/2366730
Publication date
2012
Abstract
Background: Chromatin immunoprecipitation combined with high-throughput sequencing (ChIP-Seq) is the most
frequently used method to identify the binding sites of transcription factors. Active binding sites can be seen as
peaks in enrichment profiles when the sequencing reads are mapped to a reference genome. However, the profiles
are normally noisy, making it challenging to identify all significantly enriched regions in a reliable way and with an
acceptable false discovery rate.
Results: We present the Triform algorithm, an improved approach to automatic peak finding in ChIP-Seq
enrichment profiles for transcription factors. The method uses model-free statistics to identify peak-like distributions
of sequencing reads, taking advantage of improved peak definition in combination with known characteristics of
ChIP-Seq data.
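
The idea of testing for a "peak-like distribution" of reads can be illustrated with a minimal sketch. This is not the Triform implementation (which is in R) and not the published statistic: the function names `coverage` and `peak_shape_z`, the window layout, and the count-based scaling of the score are all assumptions made here for illustration. The sketch builds a read coverage profile and then contrasts the read count in a central window with the counts in its two flanking windows, so that a region scores high only if it is enriched relative to both sides.

```python
import math


def coverage(read_starts, read_length, region_length):
    """Count how many reads cover each position of a region (0-based)."""
    cov = [0] * region_length
    for start in read_starts:
        # Clip each read to the region boundaries.
        for pos in range(max(start, 0), min(start + read_length, region_length)):
            cov[pos] += 1
    return cov


def peak_shape_z(cov, center, width):
    """Illustrative score: central window versus both flanking windows.

    Hypothetical helper, not the Triform statistic. The score is positive
    when the central window holds more signal than the average of its flanks.
    """
    lo = center - width // 2
    hi = lo + width
    if lo - width < 0 or hi + width > len(cov):
        raise ValueError("windows fall outside the region")
    c = sum(cov[lo:hi])            # central window
    left = sum(cov[lo - width:lo])  # left flank
    right = sum(cov[hi:hi + width])  # right flank
    # Second-difference contrast: 2 * center - left - right.
    diff = 2 * c - left - right
    # Assumed scaling: treat the three sums as independent counts whose
    # variance equals their mean, giving Var(diff) ~ 4c + left + right.
    var = 4 * c + left + right
    return diff / math.sqrt(var) if var > 0 else 0.0


# Usage: 20 reads of length 10 piled at position 70, versus a flat profile.
peaked = coverage([70] * 20, read_length=10, region_length=150)
flat = [5] * 150
print(peak_shape_z(peaked, center=75, width=10))
print(peak_shape_z(flat, center=75, width=10))
```

A flat profile scores zero regardless of its depth, which is the point of a shape-based test: uniformly high background is not mistaken for a peak.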
Conclusions: Triform outperforms several existing methods in the identification of representative peak profiles in
curated benchmark data sets. We also show that in many cases Triform identifies peaks that are more
consistent with biological function than those found by other methods. Finally, we show that Triform can be used to
generate novel information on transcription factor binding in repeat regions, which represents a particular
challenge in many ChIP-Seq experiments. The Triform algorithm has been implemented in R, and is available via
http://tare.medisin.ntnu.no/triform.
Keywords: ChIP-Seq, Peak finding, Benchmark, Repeats