Award details

Quantifying functional constraints in the mammalian genome

Reference BB/D015480/1
Principal Investigator / Supervisor Professor Peter Keightley
Co-Investigators / Co-Supervisors Professor Adam Eyre-Walker
Institution University of Edinburgh
Department Inst of Evolutionary Biology
Funding type Research
Value (£) 550,416
Status Completed
Type Research Grant
Start date 01/09/2006
End date 30/04/2010
Duration 44 months

Abstract

The great organismal complexity of mammals is believed to be determined to a large extent by the complexity of gene regulation. The elements that control the timing and specificity of gene expression are for the most part located in noncoding DNA, but are typically less well conserved than coding sequences, and the understanding of their nature is incomplete. For example, the fraction of the genome that is involved in gene expression control is largely unknown, as are the mode and strength of natural selection that operates on sequence variation. The comparative genomics approach can allow the identification of regulatory regions in noncoding DNA on the basis of evolutionary conservation. In our proposed project, we will develop an approach to estimate the fraction of selectively constrained coding and noncoding sites in the genome that have been conserved deep into the mammalian phylogeny. This will be based on comparisons between outgroup species that allow the inference of potentially conserved elements, then comparisons of more closely related species to estimate the fraction of selectively constrained sites in these elements. In particular, we shall compare genome-wide levels of selective constraints between murids and hominids, for which our previous work has indicated a substantially lower effectiveness of purifying selection in hominids. In parallel, we shall obtain a large polymorphism data set from wild mice that will enable us to estimate the distribution of effects of deleterious mutations in both coding and noncoding regions, using statistical methods that we have recently developed for this purpose. These data will also allow us to disentangle mutation rate variation in the genome from purifying selection as causes of conservation in putative gene regulatory regions.

Summary

Very recently, the complete DNA sequences of several mammalian genomes have been made available to the scientific community. These include the genome sequences of human, chimpanzee, macaque, dog, mouse and rat. Genomes contain genes that code for proteins, which are the building blocks of all living things. However, it has long been known that more than 98.5% of the mammalian genome consists of 'noncoding' DNA that does not code for proteins. Noncoding DNA is nonetheless important because it contains sequences that control the 'expression' of genes; that is, when and in what cells and tissues genes produce proteins. Gene expression control sequences are therefore of great interest to biologists, yet very little is known about how much of the genome consists of these sequences and where they are located in the genome. In our proposed project, we shall attempt to find out where the important gene expression control sequences are located in mammalian genomes. We will do this by comparing the genome sequences of several mammals. We will search for those parts of the genome that have remained similar to each other, and therefore have retained common functions, over the many tens of millions of years of mammalian evolution. As part of these comparisons, we shall measure and compare the amount of similarity in gene expression control sequences in the genomes of apes and rodents. Our previous work has suggested that gene expression control sequences are less strongly conserved in apes than in rodents, suggesting that natural selection has been less effective in apes. We also propose to obtain DNA sequences from genes and gene expression control regions from individuals of a population of wild mice from India. We are proposing to study an Indian population because these mice are highly genetically variable. We expect to find differences between individual mice in their DNA sequences. The numbers of DNA sequence differences will allow us to estimate how much the sequence differences affect the reproductive success of individual mice.
Committee Closed Committee - Genes & Developmental Biology (GDB)
Research Topics X – not assigned to a current Research Topic
Research Priority X – Research Priority information not available
Research Initiative X – not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme