Award details

Ab initio protein modelling for automated X-ray crystal structure solution

Reference BB/H013652/1
Principal Investigator / Supervisor Dr Martyn Winn
Co-Investigators / Co-Supervisors Dr Ronan Keegan
Institution STFC - Laboratories
Department Computational Science & Engineering
Funding type Research
Value (£) 33,348
Status Completed
Type Research Grant
Start date 01/08/2010
End date 31/07/2012
Duration 24 months

Abstract

The overall objective of this work is to facilitate access by crystallographers to a novel source of effective search models for Molecular Replacement (MR), namely ab initio protein modeling. Search models will be obtained by automated clustering and processing methods that we will develop. To allow for maximum access by crystallographers, we will use polyalanine models that can be cheaply obtained on typical PCs. To this end, we will draw on our experience of successful MR with hand-picked ab initio models (Acta Cryst, 2008, D64:1288-1291) and discover novel, effective and automatic means of clustering and processing models. We will exploit the synergy that exists between ab initio modeling, which produces clusters of structurally similar models, often capturing correctly different elements of the true structure, and the demonstrably effective structure ensemble approach to MR. Further, we will explore means for automatic deletion of inaccurately modeled termini and benchmark alternative treatments of side chains. After identifying the most successful modes of model superposition and processing, we will incorporate a pipeline for search model production into the program MrBUMP. MrBUMP is an automated program of the popular CCP4 package, which creates and feeds search models to the successful MR programs Phaser and Molrep, and hence is the ideal vehicle for dissemination of our new protocol. Importantly, the changes to MrBUMP will also enable simple, automated local running of the ab initio modeling itself. We will focus on ROSETTA as the only presently widely available ab initio program but we will observe other developing packages and compare their performance as and when they reach maturity. Easy access to ab initio-derived search models will provide the crystallographer with a new tool applicable to both entire proteins and individual domains of up to 100-120 residues, the most common domain size.

Summary

Proteins make up the functional machinery of all living beings. Their particular roles depend on their 3-dimensional structures, which allow given proteins to interact specifically with other molecules in their environment. Some proteins - enzymes - go further and can transform certain compounds into others. To understand better how proteins work and be able to use them in industry and medicine, scientists are greatly interested in figuring out their 3-dimensional structures. There are various ways to do this, but the dominant technique is X-ray crystallography. In this technique, an intense beam of X-rays is fired at a protein crystal. The X-rays are diffracted when passing through the crystal, producing a pattern of rays that is characteristic of the protein under study. In order to elucidate the structure of the protein, information derived from multiple diffraction patterns obtained from the same protein but under different conditions must be drawn together. The acquisition of such extra diffraction patterns can be time-consuming and expensive, and commonly involves hazardous chemicals. A technique exists, however, in which computers substitute for the additional experiments by estimating equivalent information from available structures of proteins similar to that under study. In this way, protein structures can be solved from a single diffraction pattern. This technique - called Molecular Replacement (MR) - is fast, economical, clean and often uncomplicated. However, since MR relies on pre-existing structures, it is not applicable to many proteins of interest, for which similar structures are simply not available. For many years, scientists have tried to develop computer methods to predict the structure of proteins, purely based on their sequences. These methods are generally called ab initio modelling methods. Over the past decade, these efforts have started to bear fruit.
These predicted models are unlikely to substitute for crystal structures any time soon since they typically contain errors, but recent work has shown that they are sometimes close enough to the real structure for them to be used in the MR process. This is the main idea behind this proposal - to adapt current ab initio modelling procedures to the specific needs of MR. With ab initio modelling, it is generally the case that the more detailed (i.e. the longer) the computer calculation, the better the model you can make. Unfortunately, achieving the best models is so demanding that it often requires extensive calculation times or access to supercomputers or other vast computer resources. Few crystallographers have access to these facilities, making the modelling method impractical. We therefore propose a different approach, making efficient use of simpler models that can be easily obtained on typical computers. In our preliminary work, we have already proven that this approach can work successfully for MR. What we want to do now is find the best way to produce optimal models and to do this automatically. This effectively means adapting the method to meet the demands of modern X-ray crystallography, making it fast so that it can be used as a routine approach and accessible to other crystallographers without specialist knowledge of ab initio modelling. We then want to include the method in the MrBUMP program, which is a well-established package allowing for easy, automated MR. MrBUMP can be added to a software package called CCP4i that is widely used by crystallographers. By incorporating our processing method in a familiar program, we expect it to become widely used across the world. We expect that by extending the MR computational approach we will enable protein structures to be determined more quickly and cheaply. In this way, research in all sorts of areas that depend on protein structure information, like drug design, will proceed faster.

Impact Summary

This proposal addresses the key phase problem that lies at the heart of protein crystallography. Exploitation of ab initio protein models in the computational solution of phases by MR will constitute another tool in the crystallographer's armory. This will enable crystallographers, in applicable cases, to avoid the crystal- and reagent-expensive experimental approaches to structure solution. The ab initio/MR approach will eventually form part of fully automated, high-throughput elucidation pipelines like those of Structural Genomics projects. The methodology will also be capable of accelerating structure refinement in cases where part of the structure can readily be placed leaving a 'missing domain'. The direct beneficiaries of the research will be crystallographers, both in the academic and commercial sectors (e.g. the pharmaceutical industry) but, significantly, the fundamental importance of structural information in all areas of biology leads to two further layers of beneficiaries. Beyond the crystallographer beneficiaries lie the collaborators who will obtain their structures of interest more rapidly and cheaply. Broader still, given the importance of protein structures to drug and vaccine design, pesticide development, biotechnological enzymes etc., the general public may genuinely be said to be the ultimate beneficiary. Society in general will benefit from better medicines and bioproducts. For the above benefits to accrue, crystallographers must first be aware of the methodology and, secondly, find it accessible and easy to use. To address the first aspect we will publicise our work at national and international conferences. Of particular relevance, since we propose to incorporate our method into the MrBUMP program, is the CCP4 user community. As part of the CCP4 collaboration, Winn and Keegan run exhibition stands at major crystallographic conferences and act as tutors at crystallographic workshops.
Such activities will be used to disseminate our findings, as part of the general training of crystallographers in advanced methods. The broader scientific community will learn of progress through publications. For maximum publicity, we will choose open access journals. Engagement of the broadest group of beneficiaries - the general public - will take advantage of successful programs at Liverpool, initiatives illustrating the importance that the School of Biological Sciences places on increasing public awareness of the impact of biological research on British society. The School actively participates in the BBSRC 'Excellence with Impact' programme. Recent School activities are the co-founding of the University's Centre for Poetry and Science and the participation in public lectures and workshops at Liverpool's 2008 British Association Festival of Science. The School also encourages young biologists with visits to local colleges and School Open Days. Nuffield Bursaries allow for summer studentship placements that further stimulate scientific thinking among the young. We will take full advantage of the University's highly experienced Corporate Communications Team to make science accessible to the general public via local media coverage and other forms of engagement. We will also access the University's Widening Participation Group, which in turn cooperates with the Educational Opportunities Group to provide an Education Liaison Service. This service arranges a wide spectrum of educational events, e.g. 4-day residential summer schools, 1-day taster days, talks for schools and colleges, a Christmas lecture programme for sixth-form students from local schools, and a newsletter distributed to 130 Heads of Biology in schools and colleges. STFC also has an active public engagement programme, and indeed this is a major part of the Council's remit, as laid out in the Royal Charter.
Previous work by Winn and Keegan has been included in the STFC Annual Report 2007-2008, in the annual CSED Frontiers magazine, and in many issues of the CCP4 newsletter.
Committee Research Committee C (Genes, development and STEM approaches to biology)
Research Topics Structural Biology, Technology and Methods Development
Research Priority Technology Development for the Biosciences
Research Initiative X - not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme