Award details

PhytoPath: an integrated resource for comparative phytopathogen genomics

Reference BB/I001077/1
Principal Investigator / Supervisor Dr Paul Kersey
Co-Investigators / Co-Supervisors Dr Ewan Birney
Institution EMBL - European Bioinformatics Institute
Department Ensembl Genomes
Funding type Research
Value (£) 433,595
Status Completed
Type Research Grant
Start date 15/11/2010
End date 14/11/2013
Duration 36 months

Abstract

We will establish a resource (PhytoPath) for high throughput molecular biology data from important phytopathogens and pathogenic phenotypes based on the Ensembl software infrastructure for genome analysis and display and PHI-base, the leading resource describing phenotypes of pathogenic infection. Information will be stored in a relational database and made available through a number of public interfaces, including a genome browser, a query-optimised data warehouse, and bulk data download. Services will be operated as an integrated part of the EBI's suite of public services, and integrated with other services offering access to genome-scale data from other species (e.g. the plant hosts of pathogen-mediated disease). PhytoPath will be run by a management board comprising key members of the U.K. phytopathogen research community, and will initially prioritise the incorporation of data from selected fungal and oomycete pathogens of particular interest in the U.K. Data types of interest to PhytoPath include genome sequence, variation information, functional and regulatory assays, ESTs, transcriptomic and proteomic data. Priority species for inclusion in the first release are Magnaporthe grisea, Mycosphaerella graminicola, and Phytophthora infestans. Subsequently, data will be selected for incorporation according to the current research priorities of the community. We will develop methods for population-scale variation analysis, and comparative genomic analysis between pathogenic and related non-pathogenic species (building on the domain-specific expertise of project partners), and include the results in each release of the database. We will also develop a new interface to support annotation of host-pathogen interactions by community users, and develop a new interface linking genotypes in host and pathogen species with the disease phenotypes.

Summary

Food security has emerged as one of the most significant challenges for humankind in the 21st century. Food shortages, high energy costs, conflicting demands on crop production for biofuel generation and the soaring demand for food from east and south-east Asia are combining to drive food prices to their highest levels for many years. A significant constraint on crop productivity is disease, which accounts for 10-20% losses in yield every year. Controlling plant diseases furthermore represents a significant cost to farmers both in time and resources. The development of new durable disease control strategies that can be deployed at low cost therefore represents one of the best means of ensuring sustainable food production. Plant pathogens (and other species, including the plants they afflict) are increasingly studied through the use of high throughput, automated experimental approaches that generate large quantities of data. For over ten years, the determination of the sequence of complete genomes (that is, all the information that determines the heritable characteristics of a species) has been possible. More recently, advances in technology have reduced the costs of genome sequencing drastically, and made possible the determination of individual genome sequences (thus allowing the sampling of populations to determine their characteristics). Similar improvements in technology have increased the quantity of data produced describing the expression of genes and proteins in a variety of experimental conditions. However, while public repositories exist for certain types of these new experimental data, there is no integrative resource available that unifies these to facilitate their interpretation. In the absence of such a resource, there is (at worst) a danger that data generated by new technologies is lost; or alternatively that every scientist wishing to exploit such data has to tediously and wastefully integrate and correct information from different data sets. 
For many scientists, the determination of a coherent, up-to-date body of data from different experiments is a near-impossible challenge, and a distraction from the challenge of using such information to solve real scientific problems. The Ensembl software platform comprises a suite of tools for the analysis, integration and display of data from complete genomes. It includes modules for the handling of population-wide genome variation amongst individuals, and the evolutionary comparison between species. The platform has been used to capture genomes from many species including vertebrates and plants. We now propose creating PhytoPath, a new resource based on Ensembl technology to capture data from phytopathogen genomes, in response to the increased interest in food security and the concomitant increase in high throughput data available for pathogens of interest. PhytoPath will be run by the EBI, Europe's leading bioinformatics service centre, but will take its scientific direction from members of the UK phytopathogen research community, who are directly engaged in producing and exploiting these data. The use of the Ensembl platform is not only cost-effective (taking advantage of solutions already developed for use in other contexts), but also offers the exciting prospect of providing access to host, pathogen (and vector) genomes through a common interface. In particular, this will facilitate the development of a new type of resource, correlating phenotype (i.e. the symptoms of pathogen-mediated disease) with genotype (i.e. individual genome sequence) in both host and pathogen. The leading current resource for plant disease phenotypes is the pathogen-host interactions database (PHI-base), maintained by Rothamsted Research. We will develop a new interface for supervised community curation of PHI-base and integrate PHI-base tightly within PhytoPath to ensure that the pathogen phenotype can be studied in its genomic context.

Impact Summary

As pressures on global food supplies grow, agriculture is of increasing economic value, and of huge significance to the large proportion of the world's population likely to face hunger if food production cannot be increased. PhytoPath will be extensively used by the agricultural biotechnology/agrochemical industry, which has considerable demand for a unified, centrally-curated database of genomic information for key plant pathogenic species. The development of an integrated resource aligning genomic, transcriptomic and proteomics data from both pathogen and host will provide a platform for researchers wishing to apply a systems approach to pathogen-mediated disease. PhytoPath will provide a framework for the management of these data, a single home for community-centric curation of the scientific literature, and bidirectional links between the two. It will permit the rapid identification of the conserved and non-conserved evolutionary themes of biology. Other potential uses include enabling reviews on plant/animal pathogenesis, analysing the results of forward and reverse genetics experiments, validating gene models (against EST libraries or RNA-seq data), analysing proteomics and protein-protein interaction studies, and designing and developing diagnostic markers with biological relevance for each species. All of these uses are invaluable to the agrochemical industry. Links to the Fungicide Resistance Action Committee's website will integrate information about known target sites on top of the genomic/phenotypic information in PhytoPath itself. All data and software from PhytoPath will be available for universal re-use without restriction, allowing companies to integrate their own private information with that already in the public domain within a secure environment (although we will encourage companies to release pre-competitive information through the public site).
In a wider sense, the potential benefits of applying ultra-high throughput sequencing technologies to plant pathogens for target identification include a significant reduction in the cost of disease control and increases in achievable yields (benefiting both food and biofuel production). The economic and quality-of-life benefits of such advances are massive. While the development of PhytoPath is in itself only a small step towards a new green revolution, the proper management and integration of genome scale data is crucial for its correct interpretation and downstream exploitation. The partners will provide training to industrial (as well as to academic) users. Additionally, both partners have strong track records for engagement with industry, both through specific collaborations and also through the EBI's Industry Programme and Small Medium Enterprise Forum, which provide an opportunity for the commercial sector to convey their current and future requirements of EBI services. We will additionally promote the project to the general public (through science fairs and press releases, and through inviting a student to create educational materials based on the project), and to potential downstream beneficiaries (e.g. meetings aimed at the agricultural community).
Committee Research Committee C (Genes, development and STEM approaches to biology)
Research Topics Crop Science, Plant Science, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme X – not Funded via a specific Funding Scheme