Award details

A Novel Comparative Method for Locating Human Conserved DNA

Reference BB/D005418/1
Principal Investigator / Supervisor Professor Jotun Hein
Co-Investigators / Co-Supervisors
Institution University of Oxford
Department Statistics
Funding type Research
Value (£) 60,480
Status Completed
Type Research Grant
Start date 01/10/2005
End date 31/03/2007
Duration 18 months

Abstract

Functional genetic elements are under purifying selection, and undergo fewer mutation events than non-functional, neutrally evolving material. This can be exploited to locate conserved and putatively functional DNA. Current methods are primarily based on models of nucleotide substitution, and are successful at localizing protein-coding genes. However, many other types of functional elements, such as transcription-factor binding sites and RNA genes, are less conserved at the sequence level, and are harder to find. This proposal concerns a novel method that is orthogonal to current methods, and focuses exclusively on insertion and deletion events (indels). Preliminary work showed that the method is highly sensitive, and can find material that is weakly conserved at the sequence level, but is under purifying selection with respect to indels. For example, we have shown a high sensitivity for localizing RNA genes, despite a generally low sequence-level conservation of such genes. A preliminary analysis further revealed a novel class of high-period tandemly repeated sequences whose structure is strongly conserved, and which are highly overrepresented in subtelomeric regions. The method may serve as the foundation of more specific automatic annotation methods. This pilot project proposes to set up a large, structured database of conserved elements in the human genome, to allow this data to be flexibly combined with downstream annotation methods, and to foster internet-enabled collaborations that build on the novel method.

Summary

Recently, the human genome and several other mammalian genomes have been completely sequenced. This gave rise to the field of comparative genomics, which aims to identify biologically functional elements by comparing related genomes and observing evolutionary patterns. It was hoped that this approach would quickly lead to breakthroughs in biological and medical research. Although many fundamental discoveries have since been made, progress has been slower than previously imagined. For example, of the estimated 5% of the human genome that has some biological function, only about 1.2% has currently been identified, most of it as protein-coding genes. Other known classes of functional genetic material include RNA genes and regulatory elements, both of which are much harder to locate. These elements are known to be involved in genetic disorders, and annotating this material is a necessary step toward screening and treatment, aside from being of obvious fundamental scientific interest. Genetic mutations mostly fall into two groups: nucleotide substitutions, and sequence insertions or deletions (indels). In mammalian genomes, substitutions are about ten times more frequent than indels, and much effort has been directed at accurate modelling of the substitution process. However, the indel process also leaves characteristic patterns at functional sites, and these patterns have not been investigated as much. The proposed research will address some of the difficulties currently encountered by comparative genomics, by developing a novel method focusing solely on indels. Preliminary work showed that the method is highly sensitive, and can find material that is weakly conserved at the sequence level, but is under purifying selection with respect to indels. For example, we have shown a high sensitivity for localizing RNA genes, despite a generally low sequence-level conservation of such genes. The method may serve as the foundation of more specific automatic annotation methods.
This pilot project proposes to set up a large, structured database of conserved elements in the human genome, to allow this data to be flexibly combined with downstream annotation methods, and to foster internet-enabled collaborations that build on the novel method.
Committee Closed Committee - Genes & Developmental Biology (GDB)
Research Topics X – not assigned to a current Research Topic
Research Priority X – Research Priority information not available
Research Initiative EDF (e-science Development Fund) (EDF) [2003-2005]
Funding Scheme X – not funded via a specific Funding Scheme