Award details

PhenoImageShare - a phenotype image annotation, sharing and discovery platform

Reference BB/K020153/1
Principal Investigator / Supervisor Dr Helen Parkinson
Co-Investigators / Co-Supervisors Dr Gautier Koscielny
Institution EMBL - European Bioinformatics Institute
Department Microarray Group
Funding type Research
Value (£) 160,622
Status Completed
Type Research Grant
Start date 01/08/2013
End date 31/12/2016
Duration 41 months

Abstract

Bio-imaging is key to observing and quantifying morphological and histological phenotype. There are major sets of image data capturing 3D and high-resolution histology images for adult and embryo phenotype. In general the phenotype resources may be searched individually, but there is no mechanism for integration, cross-query and analysis, especially with respect to human abnormality and disease phenotype. Furthermore, the annotations will typically be traits found by manual scanning; secondary analysis for subtle variation, especially at the cellular level, remains rather difficult. Finally, none of the data will be in the context of a spatio-temporal framework for spatial analysis and interoperability with atlas-based resources such as the Allen Brain Atlas and eMouseAtlas. PhenoImageShare will provide a toolkit for complex image annotation, sharing, discovery and query from federated biological images supporting phenotype description. Feasibility will be demonstrated using high-throughput phenotype images from KOMP2 and IMPC, histology images of adult tissues early post-natal (EUCOMMTOOLS) and embryo (EMAGE). Images will be a combination of 3D (OPT, uCT, uMR) and high-resolution histology sections. All tools will be applicable to other model systems. We will develop: 1. federated phenotype query capabilities across image archives; 2. standards for phenotype and spatial annotation with server technology for interoperability; 3. interfaces to annotate phenotype with standard ontologies and within image and atlas spatial markup; 4. a referencing protocol ("data-track") for spatial queries and visualisation with respect to atlas frameworks; 5. a plug-in to an open image-archiving system (OME) enabling lab-publication of annotations to be integrated via the central DB. The use-cases include demonstrating federated query of multiple image sources using the ontology-based annotation as well as direct spatial queries from a standard atlas framework such as eMouseAtlas.

Summary

Scientists have now completed the sequencing of the genome of many organisms including man, mouse, chicken and other vertebrates. The next major challenge is to understand the function of the genes and other parts of the genome in the living organism in normal, abnormal and diseased conditions. The observed variation in the living organism is called phenotype, and the grand challenge in biological science is to understand how the variation of the genome of an individual influences and changes the observed phenotype. This will provide critical information to allow scientists to decode the complex processes that can lead to abnormal development and disease. There are many measures of phenotype, for example height or weight. More recently scientists are using 2D and 3D imaging techniques to allow different phenotypes to be observed, including shape change and abnormal development (e.g. hole in the heart), but also phenotypes at the level of cellular arrangement and expression of genes. A challenge is to bring together all the pieces of information about phenotype across databases of many hundreds of thousands of images so that scientists can look for patterns that can reveal common and related causes. This project will develop software and databases that will allow the disparate image databases to be integrated in terms of the phenotypes that have been observed for each image. We will develop a central database and web-portal that will allow scientists to search all the contributing image archives to find images that relate to particular conditions and diseases. The tools will include web-based interfaces to generate a standardised version of the phenotype plus spatial or location annotation (rather like the flags on Google maps) that allow queries to relate to parts of the body. The software will allow individual laboratories or large consortia to publish their phenotype annotations in a way that is integrated with others.
This federation of databases allows cross-querying of large image archives without having to bring data together in one place, which would be very difficult to fund and maintain given the data volumes that are now collected. The PhenoImageShare resource will provide the means to integrate the phenotype images in many resources and allow scientists to search and mine for associations across all the studies, small and large, that might be relevant. This will allow faster access to relevant data and minimise use of animals by reducing re-experimentation that can arise simply because data could not be found.

Impact Summary

As reference genomes and large-scale programmes to generate mutants and knock-outs are completed, there has been a matching effort to establish and codify phenotype with genomic coverage. In parallel, bio-imaging is emerging as a primary mechanism to observe and quantify morphological and histological phenotype in mouse embryos and adults. There are also other major collections of image data, for example from the Sanger Institute's Zebrafish phenotype screens and plant screens. Current phenotyping effort will deliver annotations held in independent databases associated with the primary data, which may be searched individually, but there is no mechanism for integration, cross-query and analysis, especially with respect to human abnormality and disease phenotypes. Furthermore, the image annotations will be obvious traits found by manual scanning but will not include or allow deeper investigation for more subtle variation, especially at the cellular level. Finally, current data will not be published in the context of a common spatio-temporal framework allowing more complex analysis and interoperability with other atlas-based resources such as the Allen Brain Atlas and eMouseAtlas gene expression databases. PhenoImageShare addresses the need to locate, share annotation, map to spatial references, provide semantic and spatial queries between images and map these to image and genomic/transcriptomic objects for co-query. Mouse images will be used within the project as these are an excellent set for prototyping and freely available, but the technology and toolkit are accessible to any phenotype images from any species. The ability to index and share annotation on existing images will be delivered early, with later stages devoted to spatial queries for complex images. Technology will be delivered to the community at low granularity image tagging, while working towards integration with reference atlases.
PhenoImageShare beneficiaries:
- Biological researchers: will be able to access and query phenotype (and underlying images) in the context of genome/transcriptome, and be able to make spatial queries between large image sets.
- Bioinformatics resource providers: benefit as they will be able to access the annotations on images from the annotation server, register their own images, and access annotation and viewing tools for images without importing all the images.
- Image collection owners: will be able to provide their image annotation and location using a standard protocol, realising low-cost provision of access to public images. Image queries will be provided at a range of granularities, allowing collection owners to iterate and provide increasingly rich data as their images are annotated as projects progress.
- Publishers: will be able to provide image annotations related to papers.
- Translational researchers in industry and academia: the images in question support cellular phenotyping and complex phenotypes, and many of the image subjects are models for disease research.
- Funders: images archived, well annotated, shared and re-used, maximising value for research spend and researcher effort as well as promoting data sharing.
- Image analysis experts: a searchable corpus of images on which to develop methodology.
- Scientists in training: seeing images in the context of a reference atlas is an excellent way to learn.
- Owners or funders of reference atlases: spatial queries vs. a reference promote use of the reference set and the precision and granularity of image annotation, providing a sliding scale from tagging to precise spatial query.
- This project supports the three R's benefiting animal welfare, as images are shared, promoting use of existing animals and related data, not generation of new images.
- Pharma and SMEs both consume images from the research sector, and generate their own images, e.g. during cellular phenotyping. The resulting toolkit will promote access to research images, and allow sharing of internal images across sites.
Committee Not funded via Committee
Research Topics Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme X – not Funded via a specific Funding Scheme