Award details

Ensembl - adding value to animal genomes through high quality annotation

Reference BB/S02008X/1
Principal Investigator / Supervisor Professor Alan Archibald
Co-Investigators / Co-Supervisors Dr Emily Clark, Dr Lel Eory
Institution University of Edinburgh
Department The Roslin Institute
Funding type Research
Value (£) 368,856
Status Completed
Type Research Grant
Start date 01/08/2019
End date 31/07/2022
Duration 36 months

Abstract

High quality annotated genomes are essential resources for life sciences research. Draft reference genome sequences have been established for several farmed and domesticated animals: cattle, goat, pig, sheep; chicken, duck, turkey; dog, horse; rainbow trout, salmon, tilapia. Substantially improved genome assemblies have been established for some of these species (goat, pig, cattle, sheep, water buffalo, chicken) using long read sequencing technologies. There are gaps in the annotation of these genomes in terms of transcript complexity, non-coding genes, pseudogenes and regulatory sequences. Moreover, the pseudo haploid genome sequence of one individual provides an incomplete view of a species' genome. Scientists are generating more and better genome sequences for additional species and for additional individuals within a species. Researchers, especially in the FAANG and FAASG consortia, are generating functional data for annotation of coding, non-coding and regulatory sequences. We will analyse and annotate farmed and domesticated animal genomes as they are released, exploiting the growing volumes of functional data (short and long read RNA-seq / transcript sequences; ChIP-seq; ATAC-seq; CAGE; bisulfite sequencing) to identify coding genes, non-coding genes and regulatory sequences. We will acquire data from re-sequencing projects to characterise genetic variation within species (SNPs, indels, structural variants) and display this variation in its genomic context. We will run comparative genomics analyses both between species and within species. We will disseminate the resulting richly annotated genome sequences freely via the Ensembl Genome Browser and via an API for power users. These annotated genomes will provide an integrated view of functional sequences (coding, non-coding and regulatory) and sequence variation for a single or multiple individuals for key farmed and domesticated animals. To maximise use of this resource we will provide demonstrations and both on-line and face-to-face training.

Summary

This project will deliver high quality up-to-date annotated genomes for key farmed and domesticated animals to enable research on these economically and socially important species. Research on domesticated animals has important socio-economic impacts, including underpinning and accelerating improvements in the animal sector of agriculture, contributing to medical research by providing animal models, improving animal health and welfare and informing understanding of natural and wild animal populations. High quality annotated genome sequences are key resources to enable such research. The sequence of almost all genes (a reference genome sequence) has been determined for major farmed and domesticated animal species such as cattle, goats, sheep, pigs, chickens, ducks, turkeys, dogs and horses as well as for several important fish species, including cod, rainbow trout, salmon and tilapia. However, the strings of billions of bases (symbolised as four letters A, C, G, T) that constitute these genome sequences are not particularly useful or understandable on their own. Once a genome has been sequenced, it needs to be 'annotated' (i.e. explanatory notes need to be added to identify key features within the genome sequence) in order for research scientists to make sense of it. Annotating reference genome sequences with features such as where the coding and regulatory parts of genes are located, and the bases which differ between individuals within a species (genetic variants) greatly enhances the value and utility of the genome sequence. Visualising the genome sequences complete with annotations in a freely accessible manner further improves the value of the information. Ensembl provides a means for researchers to look at or 'browse' the annotated genome information. 
The databases and tools provided by Ensembl have been shown to be a powerful and effective means of annotating the complex genomes of animal species including humans, mice and, more recently, farmed and domesticated animals. Enabled by advances in genome sequencing technologies and associated computational methods, scientists around the world are generating more and better genome sequences. As the genome sequence of a single individual does not completely represent the genetic make-up of a species, scientists are also sequencing multiple individuals within a species. Individual research groups and international consortia are also generating sequence information that can be used in the annotation and analysis pipelines that we will run to identify both coding and regulatory sequences. We will use these data to annotate the genomes of farmed and domesticated animals, including aquaculture species. We will run comparative analyses to compare genomes both between species and between individuals within a species. These richly annotated genome sequences, which are in effect maps of where the coding gene content and regulatory sequences are located, will be made freely available to the scientific community and others via the Ensembl Genome Browser on the World Wide Web as well as via an Application Programming Interface for power users. We will also provide between-species and within-species comparative views. The annotated genomes that we will deliver are valuable not only to academic researchers, but also to scientists working in industry, including those in the animal breeding, animal health and pharmaceutical sectors. Keeping this information up-to-date, by characterising new genome sequences and integrating new data as they become available, is essential for reference genome sequences to remain current and useful.

Impact Summary

Who will benefit? We expect that the beneficiaries will include: academic and industry researchers, animal breeding companies, owners of farmed and domesticated animals, suppliers of 'omics tools and wider society.
How will they benefit? The Ensembl farmed and domesticated animal resources facilitate research on domesticated animals that has important socio-economic impacts, including underpinning improvements in the livestock sector, contributions to medical research, animal health and welfare, the evolution of domestication and the understanding of natural animal populations. Thus, the benefits to wider society are expected to be the result of research enabled by the Ensembl resources.
1. Academic and industry researchers. High quality annotated animal genomes enable a wide range of research, including genetics studies of the target species, understanding gene networks that drive developmental biology and the development of improved animal models using precision genome editing.
2. Animal breeding companies. Genomic prediction and selection have delivered significant improvements in the accuracy of selective animal breeding. Geneticists and breeders in the world's leading animal breeding and aquaculture breeding companies, of which some of the largest are UK companies, will have access to improved annotation of functional sequences and sequence variation that will allow them to develop more sophisticated models for genomic prediction.
3. Owners of farmed and domesticated animals. Owners of farmed animals will benefit indirectly through the supply of superior breeding stock and greater confidence in the genetic merit and performance of the resulting production animals. Owners of domestic and companion animals will also benefit indirectly through the use of genetic tests for inherited conditions developed through research enabled by the Ensembl resources.
4. Suppliers of 'omics tools. Suppliers of 'omics tools such as expression arrays, exon and other sequence capture products and SNP chips will also benefit from access to annotated genome sequences that can be used to design species-specific products.
5. Infrastructure and capacity building. The proposed project will contribute directly to capacity building by providing training including demonstrations, online tutorials and workshops in the use of the genome portal. This programme trains PhD students, Post Docs and research scientists to develop their skills in genome annotation, genome browsing and, importantly, how to interpret and understand their own data. Finally, the Ensembl - farmed and domesticated animals resource contributes to infrastructure to support the life sciences and is recognised as an ELIXIR resource in this context.
Committee Research Committee A (Animal disease, health and welfare)
Research Topics Animal Health
Research Priority X – Research Priority information not available
Research Initiative Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme X – not Funded via a specific Funding Scheme