Award details

Expanded Metadata Support in the Open Microscopy Environment's Bio-Formats & OMERO Data Applications

Reference BB/L024233/1
Principal Investigator / Supervisor Professor Jason Swedlow
Co-Investigators / Co-Supervisors
Institution University of Dundee
Department School of Life Sciences
Funding type Research
Value (£) 551,008
Status Completed
Type Research Grant
Start date 31/08/2014
End date 28/02/2018
Duration 42 months

Abstract

OME's OMERO platform (http://openmicroscopy.org/site/products/omero) includes server and client applications that combine an image metadata database, a binary image data repository and high-performance visualization and analysis. A permissions system controls access to data within OMERO and enables sharing of data with users in a specific group, or even publishing of image data to the worldwide community. Using these facilities, OMERO and Bio-Formats provide data access and management facilities to hundreds of laboratories worldwide and to several on-line scientific image publication systems (e.g., http://emdatabank.org/ & http://jcb-dataviewer.rupress.org). In this project, we aim to extend OME's metadata capabilities in two ways:
1. OME-TIFF and Bio-Formats are currently used by thousands of labs and many commercial imaging companies for transporting image acquisition metadata between software tools. In this project we will extend a completed draft "region-of-interest" (ROI) specification to include a comprehensive set of multi-dimensional shapes. This ROI specification will be incorporated into OME-TIFF and Bio-Formats and used specifically to transport calculated regions between software tools.
2. OMERO users are increasingly attaching non-image data to images stored in OMERO, capturing associated analytic results, annotations and other experimental outputs. OMERO currently supports these data (e.g., PDF, XML and .xls files) as "Structured Annotations", allowing indexing of any text and a unique namespace for access and recovery. Currently, these files are stored on a filesystem and accessed for download through the OMERO API. As these data grow in size and complexity, better storage, retrieval and mining are required. In this project, we will extend OMERO's NoSQL support to these image-associated documents, to ensure that all associated data can be properly stored, indexed, mined, shared, and retrieved.

Summary

Biological microscopy has always involved "imaging": images were initially hand-drawn and, with the advent of light-sensitive film, recorded and then reproduced on paper. These methods distorted the relationships between the signals they recorded (formally, they are "non-linear media"), making it difficult to use them for scientific measurements. However, the application of digital detectors to microscopy delivered "linear" measurements suitable for scientific use. This, combined with automation, spawned massive growth in the number and diversity of uses for digital imaging in basic and clinical research. Each imaging platform produces many GBytes of data, usually in a closed, proprietary file format. These are powerful systems, but their full utility is limited by closed data and the difficulty of viewing and sharing large datasets on standard desktop computers. The Open Microscopy Environment (OME) has built open software tools that enable access, analysis, viewing and sharing of these data. These tools were initially built for light microscopy, and we have successfully extended them to electron microscopy, high content screening (used for drug discovery in pharmaceutical research) and digital pathology. This proposal seeks to extend the type of data that OME covers, specifically to support the output of analyses and measurements made on digital image data. All of OME's software and resources are open source, available on-line to anyone, and supported by a dedicated team that manages documentation and community outreach.

Impact Summary

The rise of quantitative biology has driven the generation of ever-increasing stores of experimental data that are the foundation for biological research and discovery. Unfortunately, full exploitation of these data remains unrealised. Data generated on commercial platforms are not stored in easily accessible formats, and the size and complexity of these data make routine analysis and sharing difficult. Collaborations depend on data sharing, but the transfer of complex, large datasets (>100 Gbytes is routine) between scientists, labs and/or software tools limits what can be achieved and is ultimately a barrier to scientific discovery. OME's goal is to provide interfaces that enable data exchange, both between different software tools and between geographically remote scientists. Currently, OME's Bio-Formats file translation library and OMERO data management platform provide:
-- access to >125 scientific image file formats;
-- management, analysis, and sharing of image data relevant to a diverse range of biological research topics;
-- the foundation for the first on-line image publication facilities.
OME's tools are used worldwide, in thousands of laboratories, across many different domains of biological research. OME's commitment to an open development process, where all planning, roadmapping, user support, and developed code are openly available, has built an active community of users in academic, biotech and pharmaceutical research. Some simply use the software as is, but many see it as a platform upon which their own applications, defined by their research needs, can be built. OMERO is the foundation for PerkinElmer's Columbus data management system, which now handles high content screening (HCS) data in most major pharmaceutical companies in the world. OMERO and Bio-Formats also power several on-line scientific image repositories, the largest of which is the JCB DataViewer (http://jcb-dataviewer.rupress.org).
The data published at the JCB DataViewer and other public OMERO instances are being incorporated into two new data resources, the Cell Phenotype Database and BioStudy (see letters of support from Alvis Brazma and Gabriella Rustici), which are being developed at the EMBL-EBI as part of their ongoing efforts to deliver important datasets to the life sciences community. Thus OME's impact, together with its future funding and activities, enhances research and productivity in laboratories in the UK and around the world.
Committee Research Committee D (Molecules, cells and industrial biotechnology)
Research Topics X – not assigned to a current Research Topic
Research Priority X – Research Priority information not available
Research Initiative Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme X – not Funded via a specific Funding Scheme