Award details

MIDAS - Molecular Interaction Data Availability Standards

Reference BB/L024179/1
Principal Investigator / Supervisor Mr Henning Hermjakob
Co-Investigators / Co-Supervisors Professor Gos Micklem, Mrs Sandra Orchard
Institution EMBL - European Bioinformatics Institute
Department Proteomics Services Team
Funding type Research
Value (£) 604,922
Status Completed
Type Research Grant
Start date 01/07/2014
End date 31/12/2017
Duration 42 months

Abstract

Molecular interaction information is a key resource in modern biomedical research. The PSI-MI XML2.5 and MITAB2.5 formats were developed by interaction data producers and providers from both the academic and commercial sectors to enable the description of interactions between a wider range of molecular types. Both formats have been widely adopted, resulting in a raft of accompanying tools and webservices. However, new use cases have arisen that the formats cannot properly accommodate. PSI-MI XML3.0 will be written to capture both more challenging experimental data, such as dynamic data and causal interactions, and knowledge abstracted from such data, for example a description of protein complexes. A Java framework will be designed to parse and write all versions of MITAB and PSI-MI XML and to load the objects into a common framework. This framework will ease subsequent tool development, both within this grant and by external groups. A tool suite to enable the handling of large datasets will be produced on the Java interface, improved graphical representation modules will be developed, and the PSICQUIC webservice will be upgraded and improved. These tools will be incorporated into two well-used UK resources, the IntAct molecular interaction database and the InterMine data integration platform. Tutorial and training materials will be produced and a series of workshops organized to disseminate information about these new tools and resources. Finally, support will be provided for the work of the IMEx Consortium, a group of databases which cooperate to produce a high-quality, non-redundant set of interaction data, the raw materials required for large-scale data analysis and the building blocks of modern Systems Biology.

Summary

An understanding of the molecular interactions a cell makes is critical to understanding the biology of that cell, and the mechanisms by which it reacts to a change in its surrounding environment. Biologists have studied these interactions, mainly protein-protein but now extending to other molecule types, for many years, and databases were created to store such information. These databases have been united through the work of the Proteomics Standards Initiative (PSI), which produced common interchange standards that were adopted by all the major players in the field. This made it possible for researchers in the field to combine datasets from disparate resources. Additionally, many tools were written to visualize and analyze such data, and these tools worked for data from multiple different sources. The data formats have been stable since 2007, but experimental methodologies, the data types studied and the complexity of the resulting data have moved on. It is now necessary to advance the standards to meet new challenges and to write novel tools, or update successful existing applications, to work with these upgraded formats. The developers will work as part of the PSI, in consultation with data producers, data users and tool developers, to ensure the updated formats meet their needs. Once developed, the new formats and accompanying tool suite and visualization resources will be incorporated into two high-profile, UK-based resources: the IntAct molecular interaction database and the InterMine project, which provides an open-source data integration platform for several important model organisms. A training program will be delivered to ensure the outputs of this grant are understood and used by the Systems Biology community. The work of the IMEx Consortium, responsible for supplying a high-quality, non-redundant set of interaction data for network biologists, will also be supported.

Impact Summary

1. As already described under 'Academic Beneficiaries', one of the major groups who will profit from this work is large-scale data producers performing network analysis on large datasets. These include pharmaceutical companies and SMEs who map protein networks to disease and look for drugs which disrupt these networks. These companies will not only benefit from the improved tools and formats and the new API but are also interested in the PSI-MI developing the ability to describe causal interactions, as described in WP2. Overlaying pathway data with molecular interaction networks, using the much improved PSICQUIC XML webservice, will additionally enable target identification beyond the currently understood linear pathways.
2. Indirect beneficiaries will be any researcher in the fields of basic biology or biomedicine, as network biology continues to contribute to our understanding of the processes within a living cell. These researchers will gain improved access to data and to tools to utilise this data.
3. Funders, such as the Research Councils, will benefit from the increased impact of the projects they support, as the Editorial tool will make it easier to deposit interaction data into the public domain repositories, making it available for reuse.
4. The IMEx Consortium is actively encouraging the direct deposition of interaction data into databases such as IntAct in the UK, thus ensuring that publicly funded data is not lost, but rather adds to the corpus of information available to the biological community. The accession numbers issued by this consortium enable granting bodies to monitor the data sharing compliance of applicants. The UK currently plays a leadership role in this consortium, which plays a major role in the global interaction field and is of critical importance to both our industry and research scientists. The two-way interchange of information with these groups at meetings, forums and workshops is of value to us all.
5. Staff employed will benefit from exposure to numerous international collaborations through the PSI, and from new collaborations with both research groups and industry, particularly in relation to the shared development of software and training in software development and implementation.
Committee Research Committee C (Genes, development and STEM approaches to biology)
Research Topics X – not assigned to a current Research Topic
Research Priority X – Research Priority information not available
Research Initiative Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme X – not Funded via a specific Funding Scheme