Award details

ePhenotype - Visual Analytics for Integrated Large Scale Gene-Expression and Phenotype Data

Reference: BB/N023307/1
Principal Investigator / Supervisor: Professor Richard Baldock
Co-Investigators / Co-Supervisors: Dr Chris Armit
Institution: University of Edinburgh
Department: MRC Human Genetics Unit
Funding type: Research
Value (£): 146,357
Status: Completed
Type: Research Grant
Start date: 15/08/2016
End date: 14/05/2017
Duration: 9 months

Abstract

We will develop an online interface to enable cross-query and pattern matching between the EMAP embryo atlas and gene-expression database and the IMPC and DMDD knock-out mouse phenotype resources. The data will be spatially mapped by defining the complex transform between the two atlas frameworks, and users will be able to use spatial patterns to query both resources. The results, especially the morphological variation associated with the phenotype screen, will be displayed in a web-browser-based 3D visualisation tool that combines the capabilities of WebGL with the tiled views through very large 3D image volumes provided by the Internet Image Protocol (IIP), as extended by the Mouse Atlas Project to handle 3D data and arbitrary re-sectioning. The visualisation will implement novel extensions to the JavaScript WebGL library three.js in order to render effectively not only the heat-map type scalar data but also the vector and tensor data associated with the morphological mappings. In addition, we will extend the IIP3D protocol and servers to allow image manipulation and measurement of properties. For this we will implement an "image calculator" to allow segmentation, domain binary arithmetic (union, intersect, difference), morphological operations, filtering and feature measurements such as volume, area, distance and any of the Region Connection Calculus (RCC) operations. The calculator will also allow spatial domains, found for example as the intersection of two gene patterns and perhaps a phenotype heat-map, to be used as a query on the databases. By these means the scientist can browse and interactively explore the data for gene and phenotype associations. The output will be an open-source toolkit applicable to any 3D bioimaging application where the requirement is to visualise and explore very large image data volumes. In addition, we will deliver a functioning interoperable analysis tool for the large-scale mouse embryo resources comprising the Mouse Atlas, IMPC and DMDD.
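As an illustration of the domain binary arithmetic and volume measurement described for the "image calculator", the following is a minimal sketch and not the project's implementation. It assumes a spatial domain can be modelled as a set of integer voxel coordinates; the names `box`, `volume`, `gene_a` and `gene_b` are hypothetical, and a real server handling very large volumes would presumably use a far more compact domain encoding.

```python
from itertools import product


def box(x0, x1, y0, y1, z0, z1):
    """Return the set of integer voxels (x, y, z) inside a half-open box."""
    return set(product(range(x0, x1), range(y0, y1), range(z0, z1)))


def volume(domain, voxel_size=(1.0, 1.0, 1.0)):
    """Volume of a domain: voxel count times the volume of one voxel."""
    sx, sy, sz = voxel_size
    return len(domain) * sx * sy * sz


# Two hypothetical gene-expression domains that overlap along x.
gene_a = box(0, 4, 0, 4, 0, 4)
gene_b = box(2, 6, 0, 4, 0, 4)

# Domain binary arithmetic maps directly onto set operations.
union = gene_a | gene_b          # voxels in either pattern
intersection = gene_a & gene_b   # voxels shared by both patterns
difference = gene_a - gene_b     # voxels in gene_a only

# Feature measurement on a derived domain (assumed 0.5-unit isotropic voxels).
shared_volume = volume(intersection, voxel_size=(0.5, 0.5, 0.5))
```

A derived domain such as `intersection` is the kind of object the abstract proposes to pass back to the databases as a spatial query.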

Summary

Modern biomedical research generates large volumes of data, much of which is image based. Biological insight is often achieved by visualising, analysing and cross-comparing data from multiple sources. Several resources are capturing image data on a genomic scale, i.e. for every known gene. One example is the Edinburgh Mouse Atlas Project (EMAP), which hosts over 30,000 gene-expression patterns mapped onto standard models of mouse development. Another is the International Mouse Phenotyping Consortium (IMPC), which is capturing images of mice with one gene "knocked out" (that is, inactivated) to help understand the gene's function. This massive international effort will deliver such data for every known gene and will detect regions of the developing embryo that are growing abnormally and show a statistically significant departure from normal.

Here we will develop novel tools to integrate, query and co-visualise these data so that scientists can browse and analyse the image data online without needing to download vast quantities of data. These tools will deliver a "visual analytics" capability, allowing interactive exploration and hypothesis generation from the "big data" and enabling data comparisons and analysis not otherwise possible. This is now possible because of a number of key developments. First, there are novel techniques for delivering image data and image analysis via Internet (RESTful) services associated with each data resource, and here we propose to extend these capabilities to include complex morphological data associated with the deformation of biological tissues due to the knocked-out gene function. Secondly, there are techniques for cross-mapping between resources that are each associated with an atlas. We will transform data from one resource atlas to the other and enable interoperable query, comparison and co-visualisation. Finally, developments in computing allow very efficient access to very large quantities of data and make interactive image analysis possible even on multi-terabyte data volumes.

In this project we will develop novel tools to allow the visualisation and analysis of complex 3D data associated with mouse embryo development within an interactive environment. This will require novel development of tools for online visualisation based on the WebGL standard and using JavaScript, the de facto standard web-browser programming language. This visualisation includes complex deformation data expressed as vector displacements or ultimately as the strain tensor, which captures all aspects of the deformation transformation. With these tools we are delivering new ways for scientists to browse and visualise the data, to interactively explore possible data associations, and to refine and test hypotheses about gene interactions and their effect on embryo development. This will make the image data accessible in a way not previously possible and allow scientists to ask new questions that would be extremely difficult to address without these tools.
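To make the relationship between vector displacements and the strain tensor concrete, here is a minimal sketch under stated assumptions. The summary does not say which strain measure the project uses; this example assumes the infinitesimal (small) strain tensor, the symmetric part of the displacement gradient, which is only a good approximation for small deformations. The function name `small_strain` and the sample gradients are hypothetical.

```python
def small_strain(grad_u):
    """Infinitesimal strain: symmetric part of a displacement gradient.

    grad_u[i][j] is the derivative of displacement component i with
    respect to coordinate j, given as a square nested list.
    """
    n = len(grad_u)
    return [
        [0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(n)]
        for i in range(n)
    ]


# A uniform 10% stretch along x gives a single non-zero strain component.
stretch = [
    [0.1, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
strain_stretch = small_strain(stretch)

# An antisymmetric gradient (a small rotation about z) gives zero strain,
# because a rigid rotation moves tissue without deforming it.
rotation_like = [
    [0.0, -0.01, 0.0],
    [0.01, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
strain_rotation = small_strain(rotation_like)
```

The two cases show why a tensor is a useful complement to the raw displacement vectors: the displacement field is non-zero in both, but only the stretch produces strain.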

Impact Summary

Novel Toolkit for Large-Scale Image Visualisation and Image Analysis: ePhenotype will deliver a package of tools that make large-scale image resources (10 GB-1 TB) accessible for querying and browsing via a standard web browser. The package includes the server-side system, an extended standard protocol and RESTful API, a client-side JavaScript library to deliver WebGL-based rendering of complex data, and an image analysis tool to provide interactive visual analytics. This will have a wide impact on many biomedical research, translational and educational areas.

Accessibility of Image-Based Phenotype Data: The primary image data from the IMPC embryo phenotype screen is made freely available, and there is a need for an easy-to-use web interface to allow researchers to archive, find and query across large volumes of 3D image data. Making it possible to access and query mutant embryo image data easily, without the need for data download, will have a major impact on familiarising the research community with phenotype image data.

Who will benefit? The biomedical research community will benefit from the ability to access large volumes of image data easily, without the need for data download, and with easy access to anatomical atlas resources to assist in the identification of phenotypically abnormal structures. In a wider context, the ePhenotype toolkit can additionally be used as a community resource for archiving annotations from secondary phenotype screens involving re-use of IMPC embryo phenotype data.

Data Integration: ePhenotype provides the mechanism by which images archived in phenotype and gene-expression databases can be integrated. This is of critical importance because images hold information that can be mined using automated image analysis methods. By integrating these different image-based resources, we make it possible to perform a co-query across multiple database resources and so to discover novel associations between gene expression and phenotype data.

What will be done to ensure they benefit? We will raise awareness of the ePhenotype toolkit through Exhibitor Stand demonstrations at national and international meetings. The eMouseAtlas Project has an Exhibitor Stand presence at the SDB and BSDB annual meetings, and these present an opportunity for one-on-one tutorials on how to use the ePhenotype toolkit to maximal effect. In addition, we will raise awareness among the IMPC community through Prof Baldock and Dr Armit's presence at the annual IMPC Meeting. A paper describing the ePhenotype resource will be published in a high-profile journal and will be further publicised through postings on The Node. The software tools will all be open source, available from the ma-tech GitHub repository, and will be presented at bioinformatics and bio-visualisation conferences and meetings as well as through publications.
Committee: Research Committee A (Animal disease, health and welfare)
Research Topics: Technology and Methods Development
Research Priority: X – Research Priority information not available
Research Initiative: Tools and Resources Development Fund (TRDF) [2006-2015]
Funding Scheme: X – not Funded via a specific Funding Scheme