Data Driven Protein Structure Prediction: Development of a SAXS ab initio modeling in a probabilistic inference framework
Protein structure prediction is one of the most actively researched problems in computational biology. The problem itself is easily posed: model the three-dimensional structure of a protein given its genetic code. Although the genetic code ultimately determines the sequence of amino acids in the protein, a large number of factors contribute to determining its structure. These factors interact nonlinearly with one another, so a complete mathematical model of the process is extremely difficult to define. Moreover, experimental data are not easy to obtain: state-of-the-art high-resolution methods either rely on crystal structures or electromagnetic radiation, which have significant limitations and do not scale well with protein size, or force the protein into an environment that significantly alters the three-dimensional structure under analysis. To overcome these limitations, a growing number of experiments now combine several low-resolution, less invasive techniques, thus shifting part of the description problem to the definition of an objective model of the different input contributions. In this thesis, an inferential probabilistic approach is proposed that links a rigorous mathematical description of the local chemical bonds of the protein with an improved model of the input data acquired from low-resolution Small Angle X-ray Scattering (SAXS) experiments, which provide global information about the three-dimensional molecular structure.