Award details

Ensembl genome portal for farm and companion animals

Reference BB/M011615/1
Principal Investigator / Supervisor Dr Paul Flicek
Co-Investigators / Co-Supervisors Ms Bronwen Aken
Institution EMBL - European Bioinformatics Institute
Department Ensembl Group
Funding type Research
Value (£) 413,909
Status Completed
Type Research Grant
Start date 01/08/2015
End date 31/07/2019
Duration 48 months

Abstract

High-quality annotated reference genome sequences are essential bioinformatics resources for 21st-century biological research. Draft reference genome sequences have been established for several farmed and companion animal species: chicken, cattle, sheep, goat, pig, turkey, duck, dog, horse, and most recently rainbow trout. In addition, draft genome sequences for Atlantic salmon, Indicine cattle and water buffalo will be released in the near future. However, unannotated genome sequences are not immediately useful to biologists. Similarly, genome assemblies which are incomplete or for which the annotation is dated hinder progress in biological research. This proposal is concerned with using the Ensembl system to establish high-quality annotation of the genomes of farm and companion animal species, including poultry and farmed fish, and to keep that annotation current. We will annotate new or improved genome assemblies for farm and companion animals, prioritising the genomes of cattle, sheep, pigs, chickens, salmon and dogs. We will acquire sequence data being generated by the research community in experiments to characterise the extent of gene expression in different cells or under different conditions (transcriptomics, RNA-seq) or the state of the genome (epigenomics, histone marks, methylation states), or to identify transcription start sites (CAGE) or transcription factor binding sites (ChIP-seq). We will use these data to enhance the functional annotation of the target species' genomes and to make the resulting annotated genomes freely available to the research community via Ensembl. Similarly, we will acquire data that provide evidence for genetic variation within species (SNPs, indels and structural variants) and display the variation in its genomic context. We will generate comparative genomics resources including pairwise genome alignments and gene trees. We will provide training in the use of the Ensembl genome browser and associated tools.

Summary

Research on domesticated animals has important socio-economic impacts, including underpinning and accelerating improvements in the animal sector of agriculture, contributing to medical research by providing animal models, improving animal health and welfare and informing understanding of natural and wild animal populations. Knowledge of the genes that shape farm and companion animals is essential for such research. The sequence of almost all genes (a draft genome sequence) has been determined for major farmed and companion animal species such as cattle, goats, sheep, pigs, chickens, ducks, turkeys, dogs and horses. Draft genome sequences are also available, or soon will be, for several important fish species, including cod, rainbow trout, salmon and tilapia. However, the strings of billions of bases (symbolised as four letters A, C, G, T) that constitute these genome sequences are not immediately useful to biological research scientists. Annotating these draft genome sequences with features such as the locations of the coding and regulatory parts of genes, and the bases which differ between individuals within a species (genetic variants), greatly enhances the value and utility of the genome sequence. Visualising the genome sequences complete with annotations in a freely accessible manner further improves the value of the information. The web-mounted Ensembl genome browser, databases and associated annotation tools have been shown to be powerful and effective means of annotating the complex genomes of animal species including humans, mice and, more recently, farmed and companion animals. This project is concerned with improving the quality of genome annotation for farmed and companion animal genomes.
International consortia of scientists are using so-called next-generation sequencing technologies not only to sequence the genomes of more economically important species, but also to sequence the genomes of multiple individuals for each species of interest and to improve or finish the reference genome sequences for key species. These new sequencing technologies are also being used increasingly in assays, for example, of the extent of gene expression in different cells or under different conditions (transcriptomics) or of the state of the genome (epigenomics). Mapping the sequence read-outs from these assays back to the relevant genome sequence not only provides a genome-wide framework for analysis but also provides further information with which to annotate the genome sequence itself. Thus, there is a recurring need to refresh the genome sequence annotation for important animal species. We will use the Ensembl system to annotate the genome sequences of key farmed and companion animal species. The resulting annotated genome sequences will be made freely available as resources mounted on the World Wide Web. A high-quality annotated reference genome sequence is a key source of information and critical bioinformatics resource for the effective prosecution of contemporary research in the biological sciences. This key information is valuable not only to academic researchers, but also to scientists working in industry, including those in the animal breeding, animal health and pharmaceutical sectors. However, the value and utility of such bioinformatics resources are critically dependent upon the currency of the resource. Thus, this project is concerned with delivering high-quality, up-to-date annotated reference genomes for key farmed and companion animal species to enable research on these economically or socially important animal species.

Impact Summary

Who will benefit? The primary beneficiaries from this proposed development and maintenance of Ensembl resources for farmed and companion animals will be researchers in academia and industry in the UK and beyond. The access statistics and citations of Ensembl papers provide evidence of the demand for Ensembl resources from the research community. Research on domesticated animals has important socio-economic impacts, including underpinning and accelerating improvements in the animal sector of agriculture, contributing to medical research by providing animal models, improving animal health and welfare and informing understanding of natural and wild animal populations. The world's leading animal breeding and aquaculture breeding companies, of which some of the largest are UK companies, have in-house genetics expertise. Thus, these companies have the expertise to exploit the information captured and disseminated through Ensembl resources. Evidence of the value of animal genome sequences to the pharmaceutical sector is provided by their recent investments in sequencing pig and dog genomes. Suppliers of species-specific 'omics tools such as expression arrays, SNP chips and proteomics systems will benefit from access to annotated genome sequences which include links to features (e.g. probes) on their products. There are potential indirect benefits to the wider public through addressing the food security agenda, as discussed below.
How will they benefit? The proposed enhanced Ensembl resources, especially the genetic variation resources, will enable research to dissect the genetic control of economically important (and complex) traits in farmed animals, including feed efficiency and susceptibility to infectious diseases. In companion animals such as dogs, these resources will enable the identification of the determinants of inherited diseases.
Enabling genetics research in farmed animals and fish will facilitate advanced genetic improvement for these species. Genetic improvement of farmed animal species is a key means of addressing the food security agenda for the animal agriculture and aquaculture sectors. In companion animals the benefits will be improved tools for selective breeding to minimise inherited diseases and inbreeding and to improve animal welfare. The utility of 'omics technology products such as expression microarrays and SNP chips is greatly enhanced when the features on these products can be linked to a well-annotated genome sequence and other information sources. For example, probe sets for Affymetrix arrays and SNPs on Affymetrix and Illumina chips can be linked to annotated genes and genome locations respectively, thus enabling more effective use of these products. Well-annotated genomes facilitate the design of capture probes for exome sequencing; current developers of such products include Agilent and Roche NimbleGen. Academic and other researchers will benefit from the ability to link the read-out from sequencing-based assays to an annotated genome sequence. Without such a frame of reference, such assays are of limited value. The impacts on research will be delivered within the timeframe of the proposed project to enhance Ensembl resources for farmed and companion animals and will continue thereafter. Maintaining the currency of the genome assemblies and the associated annotation is critical to ensuring that these impacts continue to be effective. The indirect impacts, for example on the food security agenda and hence the benefits to the agriculture and aquaculture sectors and the wider public, will take longer to be felt. However, the time to impact for genetic tests for susceptibility to inherited or infectious diseases in animals, with their positive impacts on animal welfare, can be short (1 to 3 years).
Committee Not funded via Committee
Research Topics Animal Health
Research Priority X – Research Priority information not available
Research Initiative Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme X – not Funded via a specific Funding Scheme