Award details

Enriching Metabolic PATHwaY models with evidence from the literature (EMPATHY)

Reference BB/M006891/1
Principal Investigator / Supervisor Professor Sophia Ananiadou
Co-Investigators / Co-Supervisors Professor Douglas Kell, Professor Pedro Mendes, Dr Rafal Rak, Dr Neil Swainston
Institution The University of Manchester
Department Computer Science
Funding type Research
Value (£) 593,910
Status Completed
Type Research Grant
Start date 25/04/2015
End date 23/03/2019
Duration 47 months

Abstract

The development of genome-scale metabolic models, and their analysis through constraint-based modelling, has increased dramatically in recent years, and such models have been applied to research in human health, drug discovery and biotechnology. These models provide computational and mathematical representations of metabolism in a wide range of organisms, allowing in silico predictions of metabolic processes. Our recent work has automated the construction of draft models for over 2600 organisms from pathway databases. While providing a valuable starting point, these draft models require further manual curation, as current databases lack the coverage of metabolism required to produce detailed, predictive models. The curation process remains time-consuming and expensive, driven by the need to extract the missing details of metabolic processes manually from the literature. Recent reconstruction efforts that have led to high-quality models - such as those that we undertook in the development of yeast and human consensus metabolic networks - are heavily reliant on manual mining of the literature. This project aims to greatly reduce the time and expense devoted to manual literature mining by developing infrastructure to support literature-driven model construction. This will be achieved by introducing an integrated model development environment in which users can undertake this process. Crucial to this proposal is the integration of bespoke text mining approaches, which will extract relevant passages from publications and present them to developers as they expand and refine models. In addition to supporting the model development process itself, the environment will allow users to provide feedback on text mining suggestions, which will be used to improve their relevance. The task of generating large-scale models for any given organism, including the extraction of biochemical knowledge from the literature, will thereby move closer to one that can be fully automated.

Summary

In order to understand living systems, biologists have taken to generating predictive models of those systems, allowing them to run computational experiments that reduce the number of more traditional, lab-based experiments that would previously have been necessary to gain such an understanding. This approach follows one that is now commonplace in engineering, in which, for instance, aeronautical engineers develop sophisticated models of aircraft and test safety aspects of the proposed design in a computer, long before building the aircraft itself (or even putting it in a wind tunnel). This biological modelling approach is named "systems biology" and has been employed successfully in a number of areas. The focus of this proposal is on modelling metabolism. Metabolism is the collection of interconnected chemical reactions that allow cells to extract energy and material from the nutrients that they consume and to grow. All free-living organisms necessarily have such metabolic systems. Thus, modelling human metabolism will allow us to understand the human body's healthy state, for instance as a function of ageing, and aid in the design of chemicals (whether nutrients or drugs) that can maintain human health. In a similar vein, metabolic modelling is also being used in the development of cell factories, which are able to produce industrially relevant chemicals that are currently produced by the chemical industry through more traditional means, often using oil as a feedstock. This approach (known as fermentation or "industrial biotechnology") is not new - we have been fermenting yeast cells to produce alcohol for thousands of years - but traditional fermentation improvements, lasting decades in the case of penicillins, involved random mutation and selection, often coupled to the incorporation of harmful 'passenger' mutations.
However, recent research has shown that metabolic network modelling methods provide a rational approach, both for mature fermentations and for new ones such as bio-isoprene for sustainable car tyre production. Thus, these methods have great value for the sustainable bioproduction of important substances, such as biofuels and fine chemicals. Metabolic modelling therefore has much promise for health and environmental sustainability in this coming century. However, much of the information necessary for building these models is held in textbooks, patents and scientific journals, and large teams of researchers are required to search for, judge and extract this information before including it in the models. Thus, the traditional development of such models currently follows (and requires) a time-consuming and expensive manual process. This process of extracting information from the literature can be greatly facilitated, and in part automated, by applying the methods of text mining. Text mining applies sophisticated algorithms to recognise relevant terms and sentences buried in text, and can be trained to recognise those passages of text within a large number of documents that may be relevant to a given application. In this work, we will utilise text mining to extract information necessary for the construction of metabolic network models from the large number of scientific articles that are published daily. The results of these analyses will be presented to model developers, who will judge and extract this information to develop existing metabolic models further. A dedicated, easy-to-use web application will be developed to allow multiple users to contribute to this model building process, irrespective of their background and previous experience of computational model building.
The results of this work will be more complete metabolic models, which will allow researchers to improve their understanding of metabolism in a range of organisms, and to use this increased knowledge in applications in health and environmental sustainability.

Impact Summary

Although a variety of genome-scale metabolic networks exist, even the most mature are very far from being complete. This project will facilitate the development of genome-scale metabolic reconstructions through the use of advanced text mining approaches. It requires software to support the activity, which will also be created in this project.
Who will benefit? The beneficiaries of this research are scientists and teams of scientists that use the computational modelling of metabolism as part of their research. This includes academic researchers, research students, and scientists from industries such as pharmaceuticals, biotechnology, agriculture, cosmetics, health, and fermentation and industrial biotechnology generally. The outcome of the work, enhanced metabolic network reconstructions of a host of organisms, will benefit research in a range of fields, including human health and ageing, and biotechnological approaches to the development of biofuels and high-value chemicals.
How will they benefit? The outputs of this research will change the way in which the beneficiaries carry out modelling of metabolic networks to perform in silico experiments. This is an important part of all systems approaches. The actual reconstruction of human metabolism will benefit pharma, as it will help in identifying targets for new drugs. Since we will have developed an improved map of human metabolism, with specificity to various cell types, this will be an invaluable resource for the replacement, refinement and reduction of research using animals (3Rs), where computational modelling can be carried out relative to human cells rather than laboratory animals - this is extremely important to the cosmetic industry given the EU-wide ban on the use of laboratory animals that is now in place. The public at large will also benefit from the enhanced metabolic reconstructions, as they will provide an avenue to develop personalised nutrition, exercise and other aspects of a healthy life.
While many 'chassis' organisms are being developed for various aspects of industrial biotechnology, Saccharomyces cerevisiae and Escherichia coli will continue to play major roles because of the tools available for them. We thus intend to ensure that we drive developments in these organisms in particular. In addition, the software that will be developed will allow researchers to create metabolic reconstructions of any organism of interest. The software will allow for distributed (online) network reconstruction jamborees, and may be used to coordinate community model construction efforts for those organisms. UK industries adopting the metabolic reconstructions (such as the IPA partner Unilever) will increase their research effectiveness, which will in turn contribute to their competitiveness. We note that this resource will be of special interest to the household products and cosmetic industries, since they are now banned from using laboratory animals to develop their products and to test their effectiveness and potential toxicity. Improved, accurate metabolic networks of a range of organisms are an invaluable resource for that purpose.
Committee Research Committee C (Genes, development and STEM approaches to biology)
Research Topics Pharmaceuticals, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative X - not in an Initiative
Funding Scheme Industrial Partnership Award (IPA)