Award details

Validation of biomacromolecular structures determined by NMR spectroscopy and deposited in the Protein Data Bank

Reference BB/J007471/1
Principal Investigator / Supervisor Professor Gerard Kleywegt
Co-Investigators / Co-Supervisors Dr Aleksandras Gutmanas
Institution EMBL - European Bioinformatics Institute
Department Protein Data Bank in Europe
Funding type Research
Value (£) 307,760
Status Completed
Type Research Grant
Start date 01/04/2012
End date 31/03/2015
Duration 36 months

Abstract

NMR structures in the Protein Data Bank (PDB) contain errors that often go undetected. The errors originate from limited data quality and quantity and complex computational procedures, as well as inherent dynamics of the biomacromolecules. There is currently no extensive mandatory validation of NMR structural data before deposition to the PDB and BMRB, nor is there a complete understanding of the limitations of current validation servers (CING and PSVS). The clear need for improved NMR-derived biomacromolecular structures forms the basis of the four objectives for this proposal:
1. To implement the recommendations of the wwPDB NMR Validation Task Force (VTF) into an integrated software pipeline. This work will build upon the X-ray validation pipeline currently under development at PDBe. The pipeline will allow for validation of structures prior to deposition to the PDB and BMRB and will be used by all wwPDB partners. In addition, the pipeline will be used to assess the quality of all NMR structures in the PDB and the results will be made freely available.
2. To critically assess the utility, scope and limitations of current NMR validation tools. This work will build on the results of Objective 1 as well as prior research in the Vuister group using the CING validation software.
3. To develop new algorithms, procedures and tools for validation of high-resolution NMR structures, addressing issues such as dynamics and sparse data.
4. To disseminate validation-related information as well as newly developed validation methods for use by the wider scientific community. This will include development of new visualisation services at PDBe to help expert and non-expert users assess the quality of any NMR structure in the PDB. It will also include development of new publicly-available validation tools at Leicester.
PDBe will be the scientific lead for work on Objectives 1 and 4, and Leicester for work on Objectives 2 and 3.

Summary

The proposed research addresses the quality of 3D structures of important biological molecules such as proteins, nucleic acids and their complexes. Knowledge of these structures is essential in many areas of science and helps us understand the molecular basis of life and disease processes, design better drugs, improve the efficiency of enzymes used in the food, paper or agriculture industries, etc. Structural data are archived in a single, freely accessible, global archive called the Protein Data Bank (PDB). The structural information is deposited in the PDB by academic and industrial researchers from all over the world and is used by other scientists to advance our knowledge and understanding of human health, drug discovery, agriculture, etc. Every month, more than 25 million PDB structure files are downloaded from the websites of the four organisations that manage the PDB, the wwPDB consortium. Of the more than 73,000 entries in the PDB, about 86% were determined using a technique called X-ray crystallography, while 13% come from Nuclear Magnetic Resonance (NMR) spectroscopy. Structure determination by NMR typically involves weeks of data collection and manual analysis of complex spectra. As with any experimental technique, NMR structures may contain errors. To ensure that the data deposited in the PDB are reliable, there is a need for comprehensive validation, which tells both producers and users of structures how good (or bad) their structures are, both in absolute terms and compared to other structures. This helps NMR spectroscopists to produce better models, and structure users in academia or industry to better judge the quality of the data they want to use or to select the best data for their purposes. To address this issue, expert validation task forces (VTFs) have been set up by the wwPDB, which will recommend what validation methods should be used and which areas still require further research. Our aim is to improve validation of NMR-derived structures.
The objectives of this proposal are:
1. To implement the recommendations of the NMR VTF in an integrated software pipeline. This pipeline will be used to assess the quality of all NMR structures already in the PDB and of all structures that will be deposited in the future.
2. To critically assess the utility, scope and limitations of current NMR validation tools. This will reveal why current validation methods sometimes fail, amongst other things.
3. To develop new algorithms, procedures and tools for validation of NMR structures. Any limitations and weaknesses identified in the previous step will be addressed here.
4. To disseminate validation-related information and newly developed validation methods for use by the wider scientific community. This ensures that NMR spectroscopists will be able to produce better models, and that non-expert users will be able to assess the quality of NMR structures available to them.
One applicant, Prof. Kleywegt, is head of the Protein Data Bank in Europe (PDBe). PDBe is a founding member of the wwPDB and is part of the European Bioinformatics Institute, a world leader in bioinformatics research and services. PDBe has extensive expertise in all major structure determination techniques and is already implementing the recommendations of the wwPDB X-ray VTF. The co-applicant, Prof. Vuister, is part of the Department of Biochemistry of the University of Leicester (UoL), home to three NMR and two X-ray groups. He is a leading expert on validation of NMR-derived structures and is a member of the NMR VTF.

Impact Summary

Structure determination and the study of interactions using biomacromolecular NMR techniques remain a rapidly growing field, with many new applications being developed all the time. Both automation and integrated pipelines have gradually changed the structure determination process from a highly expert undertaking to a more routine tool for biological research, albeit one that still requires dedicated technical expertise. The current proposal addresses an important aspect of the structure determination process: validation of the resulting models in relation to both prior knowledge (chemical, physical and biological) and specific experimental data collected on a sample containing the molecule(s) of interest. Careful validation often allows detection and remediation of potential problems commonly encountered for NMR-derived structures. As a result, more reliable NMR structures will be obtained. We expect that the present project will contribute significantly to (a) quality improvement of all biomacromolecular NMR structures to be deposited in the PDB in the future, and (b) awareness of the quality of all existing NMR structures in the PDB (13% of the archive). These structures will then form a better starting point for understanding their biology, for protein engineering, homology modeling and drug design. The results of the project will strengthen the UK's capacity for scientific innovation and fit the BBSRC research priority "Technology development for bioscience". The users will include researchers at academic and government institutions, industrial laboratories, as well as students and teachers with an interest in structural biology. The tools, information and resources produced in this project will become widely available, as both partners have extensive experience in the development of web-based services.
It is the mission of PDBe to curate newly deposited structures and to provide structural data and advanced services and resources to the worldwide scientific and industrial community. Its web servers, part of the EBI data centres, process millions of requests every month. The NMR group at the University of Leicester has been at the forefront in the development of validation tools. Their dedicated iCing validation server will be expanded to accommodate the new tools. The server already performs over 1000 validation runs per year, with requests originating from every continent except Antarctica, and demand is still increasing. User input will provide important feedback regarding the general applicability and usability of our tools. Both applicants participate in many international collaborations. PDBe is a partner in the Worldwide PDB consortium and EMDataBank and has participated in many EU-funded projects. The NMR-validation research program, previously located in Nijmegen and now at the University of Leicester, was and remains a participant in many EU-funded NMR-oriented projects, such as WeNMR. The latter effort is aimed at the development of a virtual research community (VRC) that will strengthen biomacromolecular NMR as a tool in biological research. The proposed research will both benefit from and strengthen this EU effort, as the WeNMR VRC will be an excellent platform for the dissemination of some of the results of the project. The post-doctoral researchers for whom support is requested here will work in an international and scientifically excellent research environment. The project will require them to utilise and develop their technical, scientific and personal skills. It is expected that the interactions within the respective groups in Hinxton, Leicester and Nijmegen will also raise awareness of the project with other post-doctoral researchers and graduate students active in the structural biology field. In fact, these researchers comprise an important target group of users of the tools that will be developed in the project.
Committee Research Committee D (Molecules, cells and industrial biotechnology)
Research Topics Structural Biology, Technology and Methods Development
Research Priority Technology Development for the Biosciences
Research Initiative X - not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme