Award details

Unifying metabolome and proteome informatics

Reference BB/L018616/1
Principal Investigator / Supervisor Professor Andrew Dowsey
Co-Investigators / Co-Supervisors Professor Rainer Breitling, Dr Simon Rogers
Institution The University of Manchester
Department Medical and Human Sciences
Funding type Research
Value (£) 144,291
Status Completed
Type Research Grant
Start date 01/07/2014
End date 18/01/2015
Duration 7 months

Abstract

Metabolome and proteome informatics research has originated from different fields, yet their distinct perspectives have been applied to identical or similar problems. Cross-fertilisation of methodology and ideas has the prospect of seeding novel and effective approaches to analysis. Because the two fields place their emphasis on different stages of the pipeline, a unified pipeline will maximise the potential of the whole workflow for both disciplines. To this end, we propose to bring together metabolome and proteome informatics by harnessing the prominent, open source mzMatch (metabolomics) and ProteoSuite (proteomics) packages as the central nexus to establish a unified informatics suite 'borrowing strength' in methodology advancements across both fields. The fundamental benefit will be statistically consistent and comparable metabolomics and proteomics data for optimised systems biology modelling. To attain this, we will: 1) Integrate mzMatch into ProteoSuite with unified data exchange and reporting. This will: promote synergy and researcher mobility between fields; facilitate teaching and learning of a common workflow and software; facilitate development of unified data standards through cohesive data sharing and re-use; enable an open API for community-centric development of unified informatics methodology. 2) Establish the unified informatics pipeline. The key is to use the same underlying statistical methodology for both types of omics, with analysis differing only in the biological models utilised. To achieve this, we will develop novel: (a) integrated feature detection and isotope distribution modelling for metabolomics; (b) Bayesian mixture modelling for consensus identification and robust quantification in proteomics. 3) Bring together metabolome and proteome informatics communities. We will spearhead a programme of community involvement including an international one-day workshop, in order to foster a shared mind-set towards unifying the two fields.

Summary

Biologists increasingly wish to understand the complex interactions between the building blocks of genes, metabolites and proteins that control the function of every living organism. The field of systems biology has emerged to overcome the deficiencies of the traditional reductionist approach, which has identified the building blocks themselves and many of the individual interactions but has not been able to deduce how systems of these blocks act and react in unison. The application of systems biology is widespread, as it promises to revolutionise our understanding of healthy processes in plants, animals and humans, as well as how they break down under disease and how this breakdown can be averted. Often the systems biology approach starts with a 'snapshot' of a particular biological sample. Mass spectrometry is a pervasive technique for gaining a snapshot of a sample, and it does this by ionising the sample and then measuring each constituent compound's mass and quantity based on the resulting charge. This is often not enough to separate out the sample fully and therefore a preceding phase of liquid or gas chromatography is used to provide an initial separation. Classes of protein and metabolite require different sample preparation, ionisation and chromatography approaches. These all add different kinds of biases and variation which make it extremely challenging to infer links between compounds, especially if the compounds are from different classes. To make matters worse, many snapshots are needed to capture different 'angles' of the biological process under investigation, and the instrumental conditions themselves are not entirely reproducible over time. All this has led systems biology to become a progressively computational discipline. The academic disciplines for studying global patterns of proteins ('proteomics') and metabolites ('metabolomics') have broadly originated from different fields, and therefore there is little synergy between the two.
This is also the case for the computational aspect, despite the fact that both are applied to mass spectrometry data. Cross-fertilisation of methodology and ideas therefore has the prospect of seeding novel and effective approaches to analysis. The project team is involved in the development of the prominent mzMatch and ProteoSuite informatics packages for metabolomics and proteomics respectively. They are the most actively developed academic metabolome and proteome informatics packages in the UK. Therefore there is a timely opportunity to lead a concerted effort bringing together the informatics community, methodology and software for metabolomics and proteomics to: (a) Establish a new, powerful unified informatics workflow 'borrowing strength' in methodology advancements across both fields, greater than the sum of its parts and with coherent statistical properties enabling optimal integration into systems biology research; (b) Underpin cross-disciplinary collaborations, new understanding and mobility between metabolomics and proteomics fields; and (c) Support development of joint data exchange and reporting standards for optimal integration of metabolomics and proteomics data. To achieve this, we will first integrate mzMatch into ProteoSuite with unified data exchange and reporting. This will then enable the development of the novel unified informatics pipeline. The key is to use the same underlying statistical methodology for both types of omics, with analysis differing only in the biological models utilised, thus underpinning coherent delivery to downstream systems biology modelling. We will also spearhead a programme of community involvement to encourage long-term community participation in the unified informatics approach. This will include an international one-day workshop drawing in leading groups from both metabolome and proteome informatics disciplines for the first time, in order to foster a shared mind-set towards unifying the two fields.

Impact Summary

As well as the academic beneficiaries, the proposed research has significant prospective impact for the mass spectrometry industry and associated proteomics and metabolomics vendors. The proposed unified informatics suite and pipeline will increase the amount of usable data extracted from LC-MS and therefore correspondingly increase users' return on investment. This will make commercial mass spectrometry instrumentation, which requires considerable capital and running costs, more attractive. In particular, we hope this extra research capacity will attract a wider uptake of mass spectrometry in environmental, biological and health research in industry and academia, as well as a wider audience of users and uses amongst systems biology researchers. The proposed unified informatics pipeline could be seen to be in competition with software products from vendors and instrument manufacturers, particularly Progenesis LC-MS and CoMet (Nonlinear Dynamics, Newcastle, UK). However, since our software is distributed with a permissive licence allowing for its unrestricted re-use in other software packages, both free and commercial, we hope that our work will similarly aid commercial software products and therefore raise the bar for the whole field. There is considerable potential in this application for providing indirect benefits to UK public health, quality of life and environmental sustainability. Our aim is to establish, through cross-fertilisation between metabolome and proteome informatics, a powerful unified workflow with coherent statistical properties enabling optimal downstream integration into the systems biology paradigm. Due to its success and further substantial promise, the BBSRC, UK research councils and industry have invested greatly in the systems biology approach.
The potential improvements yielded by our unified workflow will therefore have a clear dissemination route to the public through reduced resources, costs and overheads required for discoveries realised with systems approaches in environmental, biological and biomedical science, and the characterisation of those discoveries. The PDRA employed on this grant will benefit significantly from exposure to the wealth of metabolome and proteome informatics expertise we will bring together in our proposed community-based initiative, particularly since the PDRA will be encouraged to play a significant role in public dissemination. They will also benefit from the uniquely concentrated cross-disciplinary interaction at CADET and the Manchester Institute of Biotechnology.
Committee Research Committee C (Genes, development and STEM approaches to biology)
Research Topics Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative Tools and Resources Development Fund (TRDF) [2006-2015]
Funding Scheme X – not Funded via a specific Funding Scheme