Award details

Automated de novo building of protein models into electron microscopy maps

Reference BB/P000517/1
Principal Investigator / Supervisor Professor Kevin Cowtan
Co-Investigators / Co-Supervisors
Institution University of York
Department Chemistry
Funding type Research
Value (£) 259,887
Status Completed
Type Research Grant
Start date 06/02/2017
End date 05/02/2020
Duration 36 months

Abstract

The aim of the project is to adapt and optimize X-ray crystallographic model building methods for effective application to the de novo building of models into high resolution cryo-electron microscopy maps, and to distribute the software to a broad user community. We will achieve these results by following the same approach to software development which we have successfully applied to previous projects:
- Firstly, a curated library of test data will be prepared for which the final structures are known. The use of test data in this way has proven critical in our previous work, as it allows every change to the software to be evaluated across a representative selection of datasets, avoiding the problem of software which only works on a single structure. It is envisaged that a full set of tests will be run on a daily to weekly time scale, depending on the time requirements. If necessary, more frequent tests will be run on a subset of the data.
- The software will be developed in the CCP4 source code repository, providing a full history of changes to the software. The current version of the software will be recorded against each set of test results to document the effectiveness of each change to the code. The source code repository is publicly visible.
- The methods used will be those developed as part of the BUCCANEER, NAUTILUS and COOT software packages, modified to provide best results against the test data.
- Release versions of the software will be incorporated into the CCP4 and CCP-EM build frameworks to allow the software to be built for Windows, Mac and Linux computer systems.
- The software will be incorporated into automated software pipelines and presented to the user through the standard CCP4 and CCP-EM user interfaces.
- The software will be incorporated into the standard CCP4 and CCP-EM installation and update tools to allow installation by normal users without the assistance of a system manager.

Summary

Scientists are interested in the atomic structure of biological molecules, in other words what the molecules look like. Knowing in detail what a molecule looks like provides important clues to how it might work. If we can go further and capture molecules in the process of interacting with other biological molecules, or artificial compounds such as drugs, we get a clearer picture of how they work. Most of our knowledge of the structure of biological molecules comes from X-ray crystallography. However, over the past decade a new technique, electron microscopy (EM), has become popular. Individual molecules held in a thin film of liquid solvent are frozen and placed in an electron microscope, which captures images of the molecules. Many individual views can be combined to construct a model of the structure of the molecule in three dimensions. Until recently these images were of limited resolution - they were 'fuzzy' - and so individual groups of atoms could not be seen. The EM user therefore needed to have some knowledge of the structure of the molecule, or at least parts of it, in advance. These fragments could then be fitted into the EM image to give an indication of the whole structure, which allowed large molecular machines such as the ribosome to be understood. New electron detectors have allowed EM images to be determined at much higher resolutions, so that small groups of atoms can be distinguished. The resulting images are of similar quality to those from X-ray crystallography. In favourable cases, this has allowed the atomic structure of the molecule to be determined without any prior knowledge of the structure. However, at the moment the process of interpreting the map in terms of atomic features is often performed manually, at a cost of considerable effort and a potential lack of objectivity in the results.
The aim of this project is to take an existing method for automatically building atomic models into images from X-ray crystallography, and modify the software to work effectively with the images from electron microscopy. Not only will this make the process of building an atomic model into an electron microscopy image much less time-consuming, it will allow multiple models to be built into different images of the molecule as an assessment of the accuracy and reliability of the results. It will be possible to go back and check existing structures by rebuilding the models into the maps automatically. This will provide a useful check on the quality of existing models determined from EM images. The project involves modifying existing computer software for building atomic models to adapt it to work on a new type of image. The software is already good at interpreting crystallographic images at the kind of resolutions produced by electron microscopy experiments, but works less well with EM images because it has been "trained" to work with crystallography images. Some retraining, and possibly some new methods, will be required. All of the software produced by the project will be distributed freely to academic users through existing software suites for crystallography and electron microscopy. The source code for the software will also be distributed so that other developers can learn from it or modify it.

Impact Summary

Cryo-electron microscopy has progressed over the last decade from being a niche technique for the low resolution imaging of large complexes, to a comparatively routine technique suitable for solving most medium to large structures. The resolution of the best EM reconstructions is now sufficient for de novo building. The UK now boasts a number of large EM facilities, for example at the MRC, Diamond and Leeds. While the capital cost of an EM facility is large, the method offers substantial benefits for certain classes of problem. Unlike X-ray crystallography, the sample does not need to be crystallized - a time-consuming and sometimes unsuccessful step. The challenge of crystallisation introduces a risk in the crystallographic pathway which carries its own cost. Consequently, EM will see increasing use in the biotech and pharmaceutical industries. EM methods have the further benefit of imaging molecules in a state which is undistorted by crystal contacts, and thus in some cases more informative for biological problems. CCP4 has been very successful in serving the biotech and pharmaceutical industries, as evidenced by over a hundred annual software licenses issued to industrial customers, raising typically £1m per annum in income. CCP-EM seeks to fill the same role for EM users. Dr Cowtan's work has contributed significantly to the success of CCP4, with his contributions to density modification, model building, visualisation and supporting infrastructure attracting over 10,000 citations in the peer-reviewed literature, as well as being cited in patents. The development of de novo model building software specialised to the interpretation of EM maps, and its contribution to the CCP-EM software suite for EM structure solution, will make the method more effective and make de novo structure solution more accessible to the typical user. The validation and automated rebuilding of existing models will reduce bias and improve the quality of the structures in the EM database.
The direct benefits to industry are expected, by analogy with X-ray crystallography, to be realised through the development of new drugs and biochemical processes, building on the insights arising from the structures determined by these methods. However, as with X-ray crystallography, we expect that linking individual products to software developments will be difficult due to the closed nature of the sector. The primary indicator of impact will remain the license fees which industrial users are willing to pay for the software. Finally, the theoretical work which underlies this proposal will improve our understanding of the features of EM electron density reconstructions, and provide a basis for other developers to address the same problems in different ways. We therefore expect that our work will provide a catalyst for an expansion of the development of software for the later stages of EM structure solution, in particular model building and refinement. UK leadership in this area will provide a competitive advantage to our users and partners in UK industry.
Committee Research Committee C (Genes, development and STEM approaches to biology)
Research Topics Structural Biology, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative X - not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme