Award details

Development of network analysis tool BioLayout Express3D

Reference BB/F003722/1
Principal Investigator / Supervisor Professor Tom Freeman
Co-Investigators / Co-Supervisors
Institution University of Edinburgh
Department Genomic Technology and Informatics
Funding type Research
Value (£) 101,325
Status Completed
Type Research Grant
Start date 01/01/2008
End date 28/02/2009
Duration 14 months

Abstract

Conventional analysis techniques are generally pair-wise: an individual relationship between two biological entities is studied without considering higher-order interactions with their neighbours. Graph and network analysis techniques allow the exploration of the position of a biological entity in the context of its local neighbourhood in the graph and of the network as a whole. BioLayout Express3D has evolved from a program called BioLayout, which was originally written as a general approach for the representation and analysis of relatively small networks of various types and complexity. BioLayout Express3D is the product of an 18-month programme of extension and modification of the core BioLayout system specifically, but not exclusively, to facilitate a new approach to the analysis of microarray expression data. These improvements include:

1. Built-in probe-to-probe Pearson correlation calculation and storage
2. Built-in network building and clustering for gene-expression data
3. Layout of large graphs
4. Highly optimised routines for layout and correlation calculation
5. 3D visualisation of network graphs
6. Input of multiple annotation classes
7. Implementation of the Markov clustering routine (MCL)
8. Expression profile viewer for single and dual colour analyses
9. Class annotation viewer

BioLayout Express3D is written entirely in Java and is portable as a jar file across Windows, Mac, Linux and other operating systems. We believe that this new approach represents a significant advance on previous analytical techniques for microarray data. The approach we have taken is novel and overcomes some of the intrinsic problems associated with the visualisation and clustering of expression data (and of large network graphs derived from other data sources). Together, we believe we have developed an approach and tool that potentially will have wide application to microarray data and beyond.

Summary

Enormous amounts of data pertaining to the functions of genes and proteins and their interactions in the cell have now been generated by a range of techniques including, but not limited to, expression profiling, mass spectrometry, RNAi and Y2H assays. Such functional genomics and proteomics approaches, when combined with computational biology and the emerging discipline of systems biology, finally allow us to begin comprehensive mapping of cellular and molecular networks and pathways. One of the main difficulties we currently face is how best to integrate these disparate data sources and use them to better understand biological systems. Visualisation and analysis of biological data as networks is becoming an increasingly important approach for exploring a variety of biological relationships. Such approaches have already been used successfully in the study of sequence similarity, protein structure, protein interactions and evolution. Shifting biological data into a graph/network paradigm allows one to utilise algorithms, techniques, ideas and statistics previously developed in graph theory, engineering, computer science and computational systems biology. In networks derived from biological data, nodes are usually genes, transcripts or proteins, while edges tend to represent experimentally determined similarities or functional linkages between them. While network analysis of biological data has shown great promise, little attention has been paid to microarray data. These data are now abundant, generally of high quality and are the type of high-dimensional data for which such approaches are well suited. We have developed a new program called BioLayout Express3D that constructs networks out of microarray expression data. This is achieved by measuring the similarity between individual gene expression profiles; where two profiles are similar, i.e. above a defined threshold, a line (edge) is used to connect them.
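
As an illustration of the network construction just described, the following Python sketch computes the Pearson correlation between every pair of expression profiles and keeps an edge wherever the coefficient meets a threshold. It is not BioLayout Express3D's own code (the tool itself is written in Java); the function names `pearson` and `build_network`, the probe names, the expression values and the 0.9 threshold are all assumptions chosen for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile has no defined correlation; treat it as unrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_network(profiles, threshold=0.9):
    """Return (probe_a, probe_b, r) for every probe pair with r >= threshold."""
    edges = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        r = pearson(prof_a, prof_b)
        if r >= threshold:
            edges.append((name_a, name_b, r))
    return edges


# Hypothetical expression profiles: one value per array for each probe.
profiles = {
    "probe_A": [1.0, 2.0, 3.0, 4.0],
    "probe_B": [2.1, 3.9, 6.2, 8.0],  # rises with probe_A
    "probe_C": [4.0, 3.0, 2.0, 1.0],  # falls as probe_A rises
    "probe_D": [5.0, 5.0, 5.0, 5.0],  # flat across all arrays
}
edges = build_network(profiles, threshold=0.9)
for name_a, name_b, r in edges:
    print(f"{name_a} -- {name_b} (r = {r:.3f})")
```

With these made-up profiles only probe_A and probe_B are connected. The all-pairs loop grows quadratically with the number of probes, so a naive version like this would be slow on full-size arrays.
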
In circumstances where there are groups of co-expressed genes within a given dataset, these nodes form a clique of interconnected nodes. Given the complexity of the data from modern array platforms, tools that provide a means of visualising and analysing large amounts of data are very much needed. The current version of BioLayout Express3D can construct graphs comprising over 10K nodes and 1M edges. Visual representation of the graphs is enhanced by a unique layout algorithm combined with an OpenGL graphics engine that renders the network graphs in 3-D space. Laying out the data in this manner has a number of distinct advantages. The position of each node (gene) within the network can be determined relative to its immediate neighbours, i.e. the genes that are closest in expression (share edges) to the one selected. This visualisation also allows the user to quickly identify, by eye, structures and features in the graph that would not have been obvious previously. Definition of these structures has also been enhanced by a graph-based clustering algorithm (MCL). Using this approach, large graphs can be divided into groups of highly connected nodes or, for expression data, clusters of co-expressed genes. Having now looked at numerous 1- and 2-colour microarray expression datasets varying in size from fewer than 20 chips to over 200, we are very happy with the basic performance of the tool. However, we urgently need to add features that will extend its analytical capabilities. The other area in which BioLayout Express3D is likely to play an important role is in modelling other types of biological relationships. In particular, we have begun to use this tool to construct graphs based on protein similarity relationships and networks based on large-scale interaction and pathway datasets. In this respect the tool is showing great promise compared with other available software packages, but again it needs further development to enhance its capabilities in this area.
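
MCL (Markov clustering) partitions a graph by simulating flow through it: it alternates expansion (squaring a column-stochastic matrix) with inflation (raising each entry to a power and renormalising the columns) until the matrix stops changing. The sketch below is a minimal pure-Python version of that general algorithm for a small dense adjacency matrix. It is not the routine used in BioLayout Express3D; the function names, the inflation value of 2.0, the tolerances and the six-node example graph are assumptions for illustration.

```python
def normalise_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iterations=50, tol=1e-9):
    """Cluster an undirected graph given as a square 0/1 adjacency matrix."""
    n = len(adjacency)
    # Add self-loops, then turn edge weights into transition probabilities.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalise_columns(m)
    for _ in range(max_iterations):
        expanded = matmul(m, m)  # expansion: flow spreads along two-step walks
        inflated = normalise_columns(  # inflation: strong flow is reinforced
            [[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Rows that still carry flow belong to attractor nodes; the columns with
    # flow into an attractor are the members of its cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(adjacency)
print(clusters)
```

In this toy graph the two triangles are densely connected internally and linked by a single edge, so the flow settles within each triangle and they are reported as separate clusters. A dense list-of-lists matrix is used only to keep the example short; it would not scale to graphs of the size mentioned above.
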
Committee Closed Committee - Biomolecular Sciences (BMS)
Research Topics Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative Tools and Resources Development Fund (TRDF) [2006-2015]
Funding Scheme X – not Funded via a specific Funding Scheme