Award details

The non-coding Arabidopsis genome

Reference BB/J00247X/1
Principal Investigator / Supervisor Professor Gordon Simpson
Co-Investigators / Co-Supervisors Professor Geoffrey Barton
Institution University of Dundee
Department School of Life Sciences
Funding type Research
Value (£) 792,345
Status Completed
Type Research Grant
Start date 01/07/2012
End date 31/12/2015
Duration 42 months

Abstract

Genome-wide RNA expression analysis reveals pervasive transcription of non-coding RNAs (ncRNAs). Evidence that these RNAs can control transcription or splicing challenges our views of how the genome functions and how it should be annotated. We recently used a third-generation sequencing technology, direct RNA sequencing (DRS), to sequence Arabidopsis RNA 3' ends directly. As early-access users, we made unexpected discoveries about the hidden transcriptome controlled by the exosome and about previously unannotated ncRNAs that map upstream of the 5' ends of annotated genes. Here we propose to use our expertise in DRS and RNA molecular biology to define the nature and functions of the non-coding Arabidopsis genome. To reveal the hidden transcriptome, we will use DRS to analyze exosome knock-down lines. We will then use deep sequencing of poly(A) neutral RNAs, together with genome-wide analysis of chromatin marks associated with transcription initiation and elongation, to annotate long non-coding RNAs. Since such RNAs can function as scaffolds in epigenetic repression, we will identify Arabidopsis ncRNAs likely to function in this way by sequencing RNAs associated with Polycomb Repressor Complexes. By genome-wide analysis of chromatin marks of transcription enhancer elements, we will determine whether the novel ncRNAs we identified upstream of annotated gene 5' ends correspond to enhancer RNAs. Finally, our current DRS data revealed multiple misinterpretations of previous expression datasets that can be explained by a huge class of unannotated non-coding snoRNAs. To prevent such errors recurring, we will provide definitive functional annotation of this large gene family. This work will have wide-ranging impact because applying this leading-edge technology to annotate and assign function to the non-coding Arabidopsis genome will also provide a pathfinder model for annotating the crop plant genomes vital to food and energy security.

Summary

What we used to think goes something like this: our chromosomes are made of DNA, and embedded within them are our genes. When our genes are switched on, they are copied into a related molecule called RNA, which eventually converts the code of our genes into proteins that carry out all the tasks in our cells. Much of the DNA in our chromosomes doesn't encode such genes and is therefore junk. But we think of things a little differently now. Although the human genome was sequenced 10 years ago, we are still working out what it codes for. To our surprise, much more of it is copied into RNA than the part that codes for proteins. We call these non-coding RNAs, and it turns out that many play critical roles in controlling how protein-coding genes are switched on, off, or modified. We now realize that to understand how genes and genomes work, we need to discover non-coding RNAs and work out what they do.
My lab is interested in how plants control flower development. We have discovered that non-coding RNAs control when flowers are made, and this has got us interested in non-coding RNAs more generally. DNA sequencing has been revolutionized recently by so-called next-generation sequencing technology. This approach can be used to sequence copies of RNA too, allowing us to identify all the RNAs made in a cell. Copies of RNA are made by an enzyme called reverse transcriptase (RT), so we almost never look at RNA directly. Unfortunately, RT can make mistakes, so our interpretations about RNA can be wrong. Last year a company called Helicos Biosciences developed a third-generation sequencing technology that can sequence RNA directly. We have collaborated with this company to sequence the ends of RNA molecules in Arabidopsis (the first plant to have its entire genome sequenced). While doing this we also found lots of non-coding RNAs and lots of problems with previous experiments that used RT to look at RNA.
In this proposal we want to build on our breakthroughs with this technology to discover and annotate the non-coding RNAs of Arabidopsis. As the annotation of all other plant genomes largely derives from Arabidopsis, this will be invaluable in understanding other plant genomes, including the crops that we depend on for food and fuel. We will look for hidden RNAs and long non-coding RNAs. Some non-coding RNAs are destroyed almost as quickly as they are made, so they are normally effectively hidden. But they can be seen in plants that lack the exosome, a group of proteins that destroys RNAs. We have already shown that there are lots of mistakes in a previous attempt to look at these RNAs, so we know that our approach is a great way to show these RNAs in accurate detail. The RNA sequencing we have already carried out identified lots of RNAs not found before. The trouble is that our sequencing only identified the ends of these RNAs, and we don't know the nature of the whole RNA molecule. A recent study of human non-coding RNAs found it helpful to fragment RNA, sequence it, and compare the results with special marks on chromosomes found where genes start and where the 'body' of a gene is. We will take the same approach here with Arabidopsis, but sequence RNA directly rather than RT-made copies. In this proposal, we will also try to identify a function for a large number of the RNAs we find. In humans, long non-coding RNAs act as a scaffold for special proteins that switch genes off. We will find RNAs that bind to these proteins in Arabidopsis. My lab has special expertise here, as we were among the first to show how to do this in Arabidopsis. Finally, our recent breakthroughs identified a set of non-coding RNAs that mostly do a fairly well understood job: snoRNAs help modify other RNAs. Unfortunately, we found lots of problems in the way people have looked at Arabidopsis non-coding RNAs that can be explained by snoRNAs and RT mistakes. To prevent this happening in the future, we will sort out the annotation of these RNAs.

Impact Summary

1. Cultural Life. Our work defines a new area of science. This curiosity-led discovery of new knowledge is something the UK public expects of its scientists, as GGS experienced when he spoke about non-coding antisense RNAs at a BBSRC-organized public engagement event at the Edinburgh International Science Festival in 2010.
2. Agricultural Industry. Our work benefits the development of world agriculture in several distinct ways. First, GGS's lab is training a new generation of plant scientists familiar with working with genetics, making crosses and phenotyping plants. Second, we are training biologists used to working in multi-disciplinary teams, combining the iterative interaction of bench scientists with bioinformaticians. Third, through analysis of huge sequencing datasets we are drawing into plant biology scientists from mathematical and physics backgrounds, who bring with them quite different skill sets and insights that can be highly beneficial to understanding plant biology and hence crop science.
3. World Economy. Dundee takes the training of PhD and Post-Doctoral scientists particularly seriously and has a specific department called 'Generic Skills' that delivers 'non-bench' training in, for example, public speaking and public engagement. Dundee provides a highly international working environment, with staff of over 50 different nationalities working in the College of Life Sciences. Dundee also houses the 3rd largest biotech cluster in the UK, with an entrepreneurial culture of spin-out companies from Life Sciences. Together, these aspects of research life in Dundee provide rounded, highly skilled and educated employees to the international workforce. For example, among recent alumni from GGS's lab who came to Dundee from overseas, C. Hornyik has gone on to hold a P.I. position in crop science in the UK and L. Terzi a managerial position in a Swiss pharmaceutical company.
4. Society Through Public Engagement. This proposal relates to fundamental understanding of plant genome organization. The GM controversy highlights the importance of public understanding of, and support for, the research we do. GGS became responsible for Dundee Plant Sciences impact activities in 2010, and since then the Division has developed links with Dundee's Botanic Garden and a hands-on DNA extraction activity to communicate that plants have genes. This has involved all members of GGS's lab and most members (83%) of the Division of Plant Sciences in 2010. Further links with the Botanic Garden are planned through the development of exhibitions and a 'Genetics Garden' to the mutual benefit of ourselves and the Garden. In this proposal we specifically describe an 'Arabidopsis Genome Scroller' installation, to complement a highly successful human genome public engagement display developed by GJB's group. Using this as a centerpiece, we will communicate generic ideas about genetics, genomics and bioinformatics and specifically introduce our research interest in non-coding RNAs to the general public. The work of our groups in public engagement is not unique, but part of the culture of Dundee University College of Life Sciences, supported by a specific University department called 'Revealing Research' and exemplified by our involvement in the current BBSRC 'Excellence with Impact' competition.
Committee Research Committee D (Molecules, cells and industrial biotechnology)
Research Topics Plant Science
Research Priority X – Research Priority information not available
Research Initiative X - not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme