Award details

CCP4 Advanced integrated approaches to macromolecular structure determination

Reference BB/S005099/1
Principal Investigator / Supervisor Professor Kevin Cowtan
Co-Investigators / Co-Supervisors Dr Jon Agirre
Institution University of York
Department Chemistry
Funding type Research
Value (£) 340,384
Status Current
Type Research Grant
Start date 01/04/2019
End date 31/03/2024
Duration 60 months

Abstract

This proposal incorporates four related work packages. In WP1 we will expand on our work using established and novel metrics of data quality and consistency to quantify the relationship between diffraction and map quality. The tools will be used to optimise approaches to structure determination from multiple or serial crystallography data, to enable optimal selection of collected data and to fully utilise all the information in structural refinement. WP1 will also develop and implement methods for electron diffraction data collection, integration and refinement. WP2 will generalise the use of shift field refinement, extend its usage to hybrid refinement approaches, and develop new software libraries to enhance and speed up protein structure model building and refinement across a wide resolution range. In WP3 we will develop and implement contact prediction methods for use in crystallography. These will help to identify protein domain boundaries and to define new search model approaches. The contact prediction approach will also be used to validate molecular replacement solutions and to assist in the interpretation of crystallographically derived protein:protein contacts. In WP4 we will develop a model for electron scattering from macromolecular samples to enable software development and experimental design. These models will be used to develop and implement new scaling algorithms for electron diffraction data within DIALS.

Summary

Proteins, DNA and RNA are the active machines of the cells which make up living organisms, and are collectively known as macromolecules. They carry out all of the functions that sustain life, from metabolism through replication to the exchange of information between a cell and its environment. They are coded for by a 'blueprint' in the form of the DNA sequence in the genome, which describes how to make them as linear strings of building blocks. In order to function, however, most macromolecules fold into a precise 3D structure, which in turn depends primarily on the sequence of building blocks from which they are made. Knowledge of the molecule's 3D structure allows us both to understand its function and to design chemicals to interfere with it. Due to advances in molecular biology, a number of projects, including the Human Genome Project, have led to the determination of the complete DNA sequences of many organisms, from which we can now read the linear blueprints for many macromolecules. As yet, however, the 3D structure cannot be predicted from knowledge of the sequence alone. One way to "see" macromolecules, and so to determine their 3D structure, involves initially crystallising the molecule under investigation, and subsequently imaging it with suitable radiation. Macromolecules are too small to see with normal light, and so a different approach is required. With an optical microscope we cannot see objects which are smaller than the wavelength of light, roughly 1 millionth of a metre; atoms are about 1000 times smaller than this. However, X-rays have a wavelength about the same as the size of the atoms. For this reason, in order to resolve the atomic detail of macromolecular structure, we image them with X-rays rather than with visible light. The process of imaging the structures of macromolecules that have been crystallised is known as X-ray crystallography.
X-ray crystallography is like using a microscope to magnify objects that are too small to be seen with visible light. Unfortunately, X-ray crystallography is complicated because, unlike a microscope, there is no lens system for X-rays, and so additional information and complex computation are required to reconstruct the final image. This information may come from known protein structures using the Molecular Replacement (MR) method, or from other sources including Electron Microscopy (EM). Once the structure is known, it is easier to pinpoint how macromolecules contribute to the living cellular machinery. Pharmaceutical research uses this as the basis for designing drugs to turn the molecules on or off when required. Drugs are designed to interact with the target molecule to either block or promote the chemical processes which they perform within the body. Other applications include protein engineering and carbohydrate engineering. The aim of this project is to improve the key computational tools needed to extract a 3D structure from X-ray and electron diffraction experiments. It will provide continuing support to a Collaborative Computing Project (CCP4, first established in 1979), which has become one of the leading sources of software for this task. The project will help users to make efficient and effective use of the synchrotrons that produce the X-rays used in most crystallographic experiments. It will also extend to the use of electron microscopes, which have gained much recent publicity with the Nobel Prize being awarded to researchers from this field. It will provide more powerful tools to allow users to exploit information from known protein structures when the match to the unknown structure is very poor. Finally, it will allow structures to be solved even when only poor-quality and very small crystals are obtained.

Impact Summary

The importance of macromolecular crystallography in general, and of CCP4 in particular, is described in the Pathways to Impacts section. CCP4 users in the pharmaceutical and biotechnology sector are most often involved in the study of protein-ligand (most often drug) complexes. The critical computational step in this process is molecular replacement (MR), in which a known atomic model from a similar structure is used to explain the diffraction pattern of the unknown structure. The MR approach is used in more than 70% of structure solutions. However, it is not uncommon for molecular replacement to yield a poor electron density map due to changes in the conformation of the protein. The software developed in this work package aims to significantly reduce the number of cases in which problems occur by increasing the range of convergence of the initial refinement of the MR model, while dramatically increasing the speed of the refinement step to allow screening of many more candidate models. Cryo-Electron Microscopy (EM) is an increasingly important method for the determination of the structure of pathogens and complexes. The same methods will also be implemented for cryo-EM data, where the resolution tolerance of the methods will facilitate the interpretation of lower resolution reconstructions. Improvement of the protein model also improves the electron density for the unmodelled ligand or drug, since the electron density features of the known and unknown regions of the structure are related through the diffraction pattern. The speed and radius of convergence of the new method will increase the coverage of automated methods for high throughput screening, which are widely used in the commercial sector. The impact of these developments will be to reduce the number of cases where structure solution fails, to reduce the level of manual intervention required in successful studies, and to increase the accuracy of the resulting structures.
YSBL has played a significant role in the commercial impact of CCP4: two YSBL-originated developments (the REFMAC and COOT software) have been the most-used tools in their field. Several other YSBL developments (DM, MOLREP, BUCCANEER, CCP4I) have citation counts in the hundreds to thousands and are widely used in industry. The YSBL group engage with commercial customers through commercial representation on the CCP4 Executive Committee and Working Groups 1 and 2, through workshops and the CCP4 bulletin board. CCP4 developers including the York group. The working groups provide guidance on which strategic planning is built. The software produced will be added to the CCP4 suite and, where appropriate, to the related CCP-EM software suite for electron microscopy. The CCP4 suite is in use world-wide and is available on Windows, Linux and macOS platforms, providing a direct distribution channel to the overwhelming majority of macromolecular crystallographers. Libraries and methods will be available to other packages as well. CCP4 is updated with major version releases roughly every year, and with automated updates on a roughly monthly basis to enable fast access to new developments. As a result, once the software has been added to the package it will be available within months to both the academic and commercial user communities.
Committee Research Committee D (Molecules, cells and industrial biotechnology)
Research Topics Structural Biology, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative X - not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme