Award details

Advancing Bayesian network algorithms for inferring gene regulation using an integrative computational-biological approach in a yeast model system

Reference BB/F001398/1
Principal Investigator / Supervisor Dr Victoria Smith
Co-Investigators / Co-Supervisors
Institution University of St Andrews
Department Biology
Funding type Research
Value (£) 563,885
Status Completed
Type Research Grant
Start date 15/02/2008
End date 14/02/2012
Duration 48 months

Abstract

The advent of large amounts of biological data has spurred much computational research into analysing this data to understand biology on a systems level. However, as computation and biology are often performed by separate groups, there is little interplay between computational development and biological experimentation; this leads to computational tools whose biological validity is unknown. I propose to rectify this issue by integrating biological experimentation with the computational development task. The proposed research concentrates on developing algorithms for revealing gene regulatory networks. Variations in gene regulation are responsible for tissue differences, developmental change, and some disease states such as cancer, and have been suggested to be a main substrate for evolutionary change. Thus, algorithms capable of accurately revealing gene regulatory networks could have impact in many areas of biology. Current algorithms for the genetic network inference task generally consider only RNA expression data, not protein expression data. However, translational regulation may be an important feature of gene regulation; thus, I propose to develop a Bayesian network inference algorithm which can model both transcriptional and translational regulation using RNA and protein data. I will additionally incorporate the ability to use other sources of information, such as location data from ChIP-chip experiments, in the network inference task. The algorithm will be developed iteratively, alongside tests in a simulation framework and biological intervention experiments in the yeast S. cerevisiae. Simulation tests will enable characterisation of the algorithm's performance across a range of situations and reveal areas to target for improvement. Biological manipulation, adjusting the level of putative regulators by incorporating inducible promoters into the genome and then measuring putative targets, will enable biological verification of algorithm performance.

Summary

Recently it has become possible to collect large amounts of data in biology, for example, measuring the expression level of every gene in yeast. This wealth of data has spurred development of computational tools to analyse it. Such data and computational tools enable us to look at biology at a broader level than previously possible: we can examine a large number of interacting elements, instead of doing directed experiments on only a few, enabling investigation into how the entire system behaves. One area of such work is to use computational algorithms to reveal gene regulatory networks. Gene regulation is when a protein--known as a regulator--binds to the DNA near a gene and affects how that gene is expressed, either increasing or decreasing the amount of RNA produced. This RNA is then used to make the protein product of the gene. So the binding of the regulator near the gene ultimately affects the amount of protein the gene makes. The regulator is also a protein, and thus was also produced by a gene making RNA, which in turn made protein. In fact, the regulator could have a regulator of its own. A gene regulatory network is a network formed by proteins that are regulators for other proteins, which either perform some function in the cell or are regulators for yet more proteins. Even though a regulatory network consists of steps going from genes to RNA and RNA to protein, current algorithms use data from only RNA, not proteins. This is mostly because RNA measurement is easier, and thus RNA data is more readily available. However, protein measurement is improving, and it may be important to consider the RNA to protein transition, as regulation could occur at this step too. Here, we propose to improve algorithms that reveal gene regulatory networks by including protein data.
Additionally, there is a lot of other information available that might help us figure out the gene regulatory network: locations where regulators have been found to bind to DNA, what genes are near DNA sequences to which we know regulators bind, what proteins bind to each other, and what genes changed expression when another gene was manipulated. We will also add all of these pieces of information into the algorithm, in an effort to take maximal advantage of the available information to accurately predict gene regulatory networks. But making an algorithm that ought to do things is not the whole story--we also have to test it. We will test the algorithms we develop in two ways. First, we will use a simulation, where we make up a gene regulatory network, sample data from it as if we were doing a biological experiment, but in the computer, and then see if the algorithm can figure out the gene regulatory network we made. This step helps us figure out where we got things right, when the algorithm finds the correct network, and where we got things wrong, when the algorithm makes mistakes. We can then work on fixing the algorithm to make fewer mistakes. Second, we will take the algorithm we have tested in the simulator, and made as good as we can, and apply it to data taken from yeast in a biological laboratory. The algorithm will output a network showing what it predicts to be the gene regulatory network based on the data. We will then pick pieces of this network, such as a regulator and gene pair, to test in our own yeast experiment. These tests will tell us if the algorithm is making accurate predictions or not. This type of validation, while important, is rarely performed because the people who make the algorithms are usually not the people who do the biology. Thus, the proposed research meets this often-missed need. The ultimate goal of this research is to produce an algorithm that does a good job of predicting gene regulatory networks.
Once we have this algorithm, future research can use it to measure gene regulatory networks and study their features. In particular, we plan to use the algorithm produced here to study the evolution of gene regulatory networks in future projects.
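
The simulation test described in the summary (make up a network, sample data from it, see whether an algorithm recovers it) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the project: the five-gene network, the linear-plus-noise data model, the function names, and especially the inference step, which is a simple correlation threshold standing in for the Bayesian network algorithm the project proposes to develop.

```python
import math
import random

# Hypothetical made-up network: (regulator, target) -> strength of the effect.
TRUE_EDGES = {("A", "B"): 0.9, ("B", "C"): 0.9, ("D", "E"): -0.9}
GENES = ["A", "B", "C", "D", "E"]  # ordered so regulators precede their targets


def simulate(n_samples, seed=0, noise_sd=0.5):
    """Sample expression levels from the made-up network (an in-computer experiment)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        level = {}
        for gene in GENES:
            parents = [(r, w) for (r, t), w in TRUE_EDGES.items() if t == gene]
            if parents:
                # Assumed model: weighted sum of regulator levels plus Gaussian noise.
                level[gene] = sum(w * level[r] for r, w in parents) + rng.gauss(0, noise_sd)
            else:
                level[gene] = rng.gauss(0, 1)  # unregulated gene
        samples.append(level)
    return samples


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def infer_edges(samples, threshold=0.5):
    """Stand-in inference (NOT a Bayesian network): link strongly correlated gene pairs."""
    predicted = set()
    for i, g in enumerate(GENES):
        for h in GENES[i + 1:]:
            r = pearson([s[g] for s in samples], [s[h] for s in samples])
            if abs(r) > threshold:
                predicted.add(frozenset((g, h)))  # undirected edge
    return predicted


def score(predicted):
    """Compare predicted undirected edges with the made-up truth."""
    truth = {frozenset(edge) for edge in TRUE_EDGES}
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth)
    return precision, recall


if __name__ == "__main__":
    predicted = infer_edges(simulate(500))
    print(sorted(tuple(sorted(e)) for e in predicted), score(predicted))
```

Because the true network is known, each mistake can be traced to its cause. With these settings the stand-in is likely to link A and C even though A affects C only through B, which is the kind of error the simulation step is meant to expose before any laboratory work is done.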
Committee Closed Committee - Engineering & Biological Systems (EBS)
Research Topics Microbiology, Systems Biology, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative X - not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme