Iterative sub-network component analysis enables reconstruction of large scale genetic networks
Peer reviewed, Journal article
Published version
Permanent link
https://hdl.handle.net/11250/2657805
Publication date
2015
Original version
10.1186/s12859-015-0768-9
Abstract
Background: Network component analysis (NCA) has become a popular tool for understanding complex regulatory networks. The method uses high-throughput gene expression data and an a priori topology to reconstruct transcription factor activity profiles. Current NCA algorithms are constrained by several conditions imposed on the network topology to guarantee unique reconstruction (termed compliancy). However, the restrictions these conditions pose are not necessarily true from a biological perspective, and they force a reduction in network size, pruning potentially important components.

Results: To address this, we developed a novel method, Iterative Sub-Network Component Analysis (ISNCA), for reconstructing networks of any size. By dividing the initial network into smaller, compliant subnetworks, the algorithm first predicts the reconstruction of each subnetwork using standard NCA algorithms. It then subtracts from the reconstruction the contribution of the shared components from the other subnetwork. We tested ISNCA on real, large datasets using various NCA algorithms. The size of the networks we tested and the accuracy of the reconstruction increased significantly. Importantly, FOXA1, ATF2, ATF3 and many other known key regulators in breast cancer could not be incorporated by any NCA algorithm because of the necessary conditions. However, their temporal activities could be reconstructed by our algorithm, and therefore their involvement in breast cancer could be analyzed.

Conclusions: Our framework enables reconstruction of networks from large gene expression data without reducing their size or pruning potentially important components, while at the same time rendering the results more biologically plausible. Our ISNCA method is not only suitable for predicting key regulators in cancer studies, but can also be applied to any high-throughput gene expression data.
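The split, reconstruct and subtract idea described in the Results can be illustrated with a small sketch. It is a toy illustration and not the published ISNCA implementation. It assumes the usual NCA model E ≈ A·P (E: genes × samples expression, A: genes × TFs connectivity constrained to a known topology mask Z, P: TFs × samples activities), uses a plain alternating least squares fit as a stand-in for a real NCA algorithm, and assumes that the contribution of the other subnetwork is subtracted from the expression data before each refit. The function names `nca_als` and `isnca_sketch`, the two-way TF split and the synthetic data are all hypothetical.

```python
import numpy as np

def nca_als(E, Z, n_iter=50, seed=0):
    """Toy stand-in for NCA: fit E ~ A @ P with A restricted to the mask Z."""
    rng = np.random.default_rng(seed)
    A = Z * rng.uniform(0.5, 1.5, size=Z.shape)  # zeros wherever Z is False
    for _ in range(n_iter):
        # Update TF activities P by least squares with A held fixed.
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # Update each gene's strengths, using only the TFs the topology allows.
        for g in range(Z.shape[0]):
            idx = np.flatnonzero(Z[g])
            if idx.size:
                A[g, idx] = np.linalg.lstsq(P[idx].T, E[g], rcond=None)[0]
    return A, P

def isnca_sketch(E, Z, tf_groups, n_outer=10):
    """Iterate over TF groups, refitting each on data minus the others' part."""
    n_tfs = Z.shape[1]
    A = np.zeros(Z.shape)
    P = np.zeros((n_tfs, E.shape[1]))
    for _ in range(n_outer):
        for tfs in tf_groups:
            others = np.setdiff1d(np.arange(n_tfs), tfs)
            # Genes regulated by at least one TF of this subnetwork.
            genes = np.flatnonzero(Z[:, tfs].any(axis=1))
            # Remove what the other subnetwork currently explains for them.
            E_sub = E[genes] - A[np.ix_(genes, others)] @ P[others]
            A_sub, P_sub = nca_als(E_sub, Z[np.ix_(genes, tfs)])
            A[np.ix_(genes, tfs)] = A_sub
            P[tfs] = P_sub
    return A, P

# Synthetic example: 8 genes, 4 TFs, 12 samples; genes 1, 3, 5, 7 have two TFs.
rng = np.random.default_rng(1)
Z = np.array([[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0],
              [0, 0, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1], [1, 0, 0, 1]],
             dtype=bool)
A_true = Z * rng.uniform(0.5, 2.0, size=Z.shape)
P_true = rng.normal(size=(4, 12))
E = A_true @ P_true
A_hat, P_hat = isnca_sketch(E, Z, tf_groups=[[0, 1], [2, 3]])
```

In this example genes 3 and 7 are regulated by TFs from both groups, so they play the role of the shared components. A real application would replace `nca_als` with an established NCA algorithm, check each subnetwork for compliancy before fitting, and normalize A and P, since the factorization is only determined up to a scaling per TF.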