Award details

Closing the gaps in metabolomics - Identifying unknown metabolites and mapping onto biochemical pathways

Reference BB/K004301/1
Principal Investigator / Supervisor Professor Christoph Steinbeck
Co-Investigators / Co-Supervisors
Institution EMBL - European Bioinformatics Institute
Department Chemoinformatics and Metabolism
Funding type Research
Value (£) 118,874
Status Completed
Type Research Grant
Start date 01/12/2012
End date 30/04/2014
Duration 17 months

Abstract

For many organisms of interest to biological research, only a fraction of the metabolome is known. More than half of all signals we routinely detect in an MS/MS metabolomics experiment are unknowns, and many metabolites still cannot be identified; this remains a significant unsolved problem. We propose to develop a tool that builds on existing work in our group on I) cheminformatics-based structure elucidation of unknowns in organic chemistry, II) the use of existing reference databases of spectroscopically characterised compounds to assist with metabolite identification and III) bioinformatics reasoning to assess the likelihood that a suggested structure fits into the biochemical repertoire of an organism's metabolism and a specific pathway, once the general chemical group has been identified. For point I), we will adapt our existing infrastructure for the stochastic screening of chemical spaces for compounds with a given set of computable properties to the problem of structure elucidation in metabolomics. The very large spaces to be screened when little spectroscopic information is available will be constrained through a yet-to-be-developed notion of biosynthetic accessibility within the organism, tissue or cell type under investigation. These reduced candidate spaces will then be mapped to metabolites appearing in known pathways through the development of biologically relevant spectroscopic, chemical and semantic similarity measures based on the data mentioned under II) above. This tool will facilitate the full integration of metabolomics into the rest of the bioinformatics analysis pipeline, supporting the identification of hypotheses about underlying disease mechanisms of action and pinpointing the mechanisms behind individual differences in cellular phenotypes.

Summary

Metabolic profiling provides one of the most complete sources of information for pinpointing physiological conditions at a moment in time. It therefore provides a vehicle for understanding the cellular processes involved in both normal functioning and dysfunctions caused by systemic diseases, mapping the interactions of the organism with its environment as mediated by genetic factors. Identification of the small-molecule metabolites in the measured samples is essential to facilitate downstream research into underlying mechanisms. Current analysis techniques rely on the identification of known metabolites that are already mapped to reference metabolic networks and pathways, yet such reference knowledge about the metabolome is far from complete. One of the major current challenges for the metabolomics/metabonomics community is the large and increasing number of unknown metabolites detected as instrumental sensitivity, dynamic range, acquisition speed and mass accuracy increase exponentially. Knowledge about the structural identity of metabolites is essential to understand their interaction with their biological targets, an understanding that can then aid the development of methods to interact with biological processes in, for example, chemical biology. Identification of unknowns is a major unsolved problem in metabolomics, requiring time-consuming further experimental work. We propose to develop a tool that addresses this gap by predicting unknown metabolites and projecting metabolic sample data containing knowns and "predicted unknowns" onto biochemical network and pathway knowledge bases. We will build on state-of-the-art methods for structure elucidation and develop methods for bioinformatics reasoning about the likelihood that predicted structures fit into the biological context of the metabolic sample.
This tool will be made available within the context of the BBSRC-funded cross-species, cross-platform MetaboLights database and repository for metabolomics experimental data, and will additionally be provided as open source, thus benefiting the broadest possible range of academic and industrial researchers. The work will be based on open data and will be published as open source software and open access publications, and will thus be accessible to all interested parties.

Impact Summary

A tool for predicting unknown metabolites in metabolomics experiments and placing sample data in the context of biochemical pathways will benefit a number of significant communities performing biological research and development. Metabolomics is of major importance for our understanding of how biological systems behave under various conditions and for developing personalized medicine and nutraceuticals because metabolites, as end products of cellular regulatory processes, provide insights into the actual functioning of biological systems as a combined response to genetic and environmental factors as well as disease. Metabolic analysis is also less invasive than other techniques, as it can often be performed on saliva or small samples of blood rather than whole tissues. The small molecule (metabolite) content of such body fluid samples can provide indicators of the presence or absence of disease, the functioning of biological processes at the cellular level, early warning signs of adverse reactions and more. Our proposal is aligned with a number of strategic research priorities of the BBSRC: In systems approaches to biological research, metabolomics allows us to study how the metabolic system reacts to changes in the environment, to stress, to disease and to other boundary conditions with high time resolution. Identification of metabolites is essential to understand their interaction with their biological targets, an understanding that can then aid the development of methods to interact with biological processes in, for example, chemical biology approaches. For ageing research, metabolomics is used to study and characterize states and dynamics of the ageing organism with no (urine) or low (blood) invasiveness, or through tissue analysis.
In both bioenergy research and crop science, metabolite identification allows us to study how plants or microbes used for energy harvesting react to environmental changes (robustness) or how their energy metabolism reacts to genetic manipulation or other perturbations (flexibility). The complexity of metabolism in the plant kingdom makes this a particularly challenging area. Small molecule metabolism is currently also becoming a major emphasis for UK industry, including the drug safety assessment process in the pharmaceutical industry, pesticide toxicology in agrochemicals, biomarker discovery for medical diagnostics and plant fitness for crop development. Metabolites are used a) as diagnostic biomarkers and b) for classifying patients by their phenotype. In the public sector, the flourishing fields of translational medicine and chemical biology will benefit from information about which metabolites in which pathways in human, animal or plant metabolism are affected and in which way. The scientific communities will be informed about the developments associated with this proposal through a) presentations (talks and posters), b) workshops given by project members at scientific meetings and c) publications in peer-reviewed journals and the member journals of the learned societies representing the communities mentioned above. As pointed out in the statement of data sharing, the software created in the course of this project will be fully open source and all the research will be conducted based on open data. All the results as well as the workflows and software themselves are therefore fully accessible and re-usable by the scientific community and secondary beneficiaries such as physicians or metabolic engineers. The European Bioinformatics Institute (EMBL-EBI) possesses a dedicated Outreach and Training department run by Dr Cath Brooksbank and a team of eight co-workers.
This department will coordinate a wide range of activities aimed at raising awareness about the proposed tool and associated activities of the EBI among potential users, our peers, our funders and the general public.
Committee Research Committee D (Molecules, cells and industrial biotechnology)
Research Topics Technology and Methods Development
Research Priority Technology Development for the Biosciences
Research Initiative Tools and Resources Development Fund (TRDF) [2006-2015]
Funding Scheme X – not Funded via a specific Funding Scheme