Award details

Structural data-mining of high-resolution 3DEM maps in EMDB

Reference BB/P026893/1
Principal Investigator / Supervisor Dr Ardan Patwardhan
Co-Investigators / Co-Supervisors Professor Gerard Kleywegt, Dr Garib Murshudov
Institution EMBL - European Bioinformatics Institute
Department Protein Data Bank in Europe
Funding type Research
Value (£) 150,481
Status Completed
Type Research Grant
Start date 01/02/2018
End date 31/01/2019
Duration 12 months

Abstract

The proposed research aims to develop methods and corresponding software for selecting and validating a set of reliable, high-resolution cryo-EM maps and corresponding atomic models, segmenting the maps with the help of atomic coordinates, and identifying sections of the maps that correspond to small fragments. Segmented map fragments will be aligned and resolution-dependent classification will be carried out. The methods developed and the data derived will be implemented in user-friendly software tools. The main methods to be used are advanced tools from modern statistics and from image processing and analysis. They include multivariate data analysis (multi-dimensional scaling and principal component analysis), alignment in rotation groups (special orthogonal groups in three dimensions, SO(3)), Procrustes analysis and classification techniques.

The objectives of the proposed project are:
1) To develop methods for selecting EMDB entries with fitted atomic models, segmenting the EM density using the fitted model as a guide, and aligning, classifying and averaging the segmented densities to obtain 3D density-motif libraries.
2) To explore variations in motifs as a function of resolution and environment to better understand the information content of EM structures.
3) To exploit the motif libraries to develop validation metrics based on the comparison of atomic models and motif densities.
4) To develop a production process to periodically update motif libraries as new entries are added to EMDB, and to provide versioned, open, public access to the motif libraries.
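As an illustration of the kind of alignment in SO(3) and Procrustes analysis named in the abstract, the following is a minimal sketch, not the project's software. It assumes each fragment is represented by matched 3D points (for example, atomic coordinates of equivalent atoms) rather than by density grids, and it uses the standard SVD-based rigid superposition (the Kabsch solution to the orthogonal Procrustes problem). The function names `kabsch_align` and `rmsd` are hypothetical.

```python
import numpy as np


def kabsch_align(P, Q):
    """Find a proper rotation R (det = +1) and translation t such that
    P @ R.T + t is as close as possible to Q in the least-squares sense.

    P and Q are (N, 3) arrays of matched points (assumed representation).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last singular direction if needed so that R is a rotation,
    # not a reflection (keeps R inside SO(3)).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t


def rmsd(A, B):
    """Root-mean-square deviation between two matched (N, 3) point sets."""
    diff = np.asarray(A, dtype=float) - np.asarray(B, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = rng.normal(size=(10, 3))          # synthetic fragment, not real data
    angle = 0.7
    R_true = np.array([
        [np.cos(angle), -np.sin(angle), 0.0],
        [np.sin(angle),  np.cos(angle), 0.0],
        [0.0,            0.0,           1.0],
    ])
    Q = P @ R_true.T + np.array([1.0, -2.0, 0.5])
    R, t = kabsch_align(P, Q)
    print("RMSD after alignment:", rmsd(P @ R.T + t, Q))
```

After superposition, residual distances such as the RMSD could feed a pairwise dissimilarity matrix for multi-dimensional scaling or classification; aligning density fragments directly would additionally require interpolation on a grid, which this sketch does not attempt.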

Summary

Almost all biological processes in living organisms are carried out by biomacromolecules such as proteins and nucleic acids. Studying the three-dimensional structures of these macromolecules is an essential step in understanding fundamental biological processes at the atomic level. The availability of these structures also helps in understanding disease processes and facilitates the design of new drugs to fight them. There are three main techniques for determining the three-dimensional structures of biomacromolecules: X-ray crystallography, Nuclear Magnetic Resonance spectroscopy (NMR) and single-particle cryo-electron microscopy (cryo-EM). Recent technical advances in cryo-EM have led to a large increase in the use of this technique, and it may well become the method of choice for structural biologists in academia and possibly the pharmaceutical industry. Moreover, large macromolecular complexes can only be studied using cryo-EM techniques. Unfortunately, methods for validation and reliability analysis of derived atomic models and maps are lagging behind. Our research proposal aims to tackle one aspect of validation, namely which structural details can be reliably interpreted in a given map. Apart from helping to understand the structural details that can be derived from a given map, the results of this research will also help to improve derived atomic models.

Impact Summary

Academic impact
1) The developed resolution metrics based on the density-motif libraries will help the EM community better assess the quality of the EM structures they produce and download from the EMDB.
2) The developed resolution metrics will make it easier for the wider biological community to interpret EM structures.
3) The developed side-chain density libraries will lead to a better understanding of the density in EM maps and its variation with local environment. This will be important for efforts aimed at driving the field to higher resolutions, for developers of software for building atomic models into EM maps, and for physicists and chemists interested in an improved understanding of the interaction between electrons and soft matter at the atomic level.
4) The model-guided segmentation, alignment and classification software that will be developed in the project has broader applicability beyond this project, for instance for developing density-motif libraries at the domain level and in other 3D image processing applications.
5) Training of a scientific programmer (SP) at EBI and a post-doc (PDRA) at LMB.
6) Training of PhD students and post-docs in the structural biology community.
7) Imagery based on the density-motif libraries could be exploited in textbooks and resources aimed at a more general audience such as undergraduate students and bioinformaticians (e.g., Arthur Lesk's "Introduction to Protein Science: Architecture, Function and Genomics", 3rd Edition, Oxford University Press 2016 has been very instructive in conveying to a broader audience what X-ray density looks like at different resolutions).

Economic and societal impacts
1) Users from public, private and third-sector organisations will accrue the same benefits from the results of this project as academic users by virtue of the open access to software, motif libraries and publications, but further interactions will be needed to better understand specific issues relating to area-specific applications.
2) The colour-coded imagery of structures based on resolution metrics can be aesthetically pleasing and can be exploited for the presentation of structural data to a broader audience.

Committee Not funded via Committee
Research Topics Structural Biology, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative Tools and Resources Development Fund (TRDF) [2006-2015]
Funding Scheme X – not Funded via a specific Funding Scheme