Award details

A Community Resource for the Prediction of Protein Structure: PHYRE

Reference BB/G022569/1
Principal Investigator / Supervisor Professor Michael Sternberg
Co-Investigators / Co-Supervisors
Institution Imperial College London
Department Life Sciences
Funding type Research
Value (£) 311,004
Status Completed
Type Research Grant
Start date 01/11/2009
End date 31/10/2012
Duration 36 months

Abstract

Our current server for protein structure prediction (PHYRE) is used by hundreds of groups worldwide, due partly to a user interface that is informative and easy for biologists to use. Recent major improvements made in our lab to the underlying structure prediction algorithms have been shown to lead to world-class modelling accuracy in the recent CASP8 blind trial of structure prediction. We propose to bring these improvements to the biologist via a new server with a range of additional powerful user features:
1) An interface for automatic modelling of multi-domain proteins by iterative fold recognition, multiple template modelling, a powerful new HMM profile matching algorithm and a homology network data-mining approach.
2) Extensive help documentation, tutorials, and three 1-day workshops.
3) Robust and up-to-the-minute fold library maintenance, including regular total reconstruction protocols to stay current with mounting sequence data, and extensive error-checking routines, e.g. updating PDB entries as higher resolution templates become available.
4) Using CDD and PFAM, long user sequences will be automatically parsed for clear domain boundaries, with an interactive web interface to manage multi-domain proteins and selective domain modelling.
5) Integration of 3D model quality estimation using state-of-the-art tools from the CASP MQAP category.
6) Improved model visualisation options, e.g. rendering models according to alignment and/or model quality, disorder, sequence motifs, sequence conservation, evolutionary trace etc.
7) Prediction of other structural/functional features with extant software, e.g. transmembrane helices, coiled coils, repeats and functional residue predictions.
8) An expert mode to allow user selection of templates in multi-template modelling, private submission of user-supplied structures to thread against, and batch processing of multiple sequences.
9) Reinstatement and update of a previously successful functional text data mining approach.

Summary

Proteins are large molecules that are the machinery of life. They are long chains of different components, and the order of these components is the amino-acid sequence. The genome projects are now determining the sequences of proteins from many species including humans, plants, animals and microbes. Experimental methods can reveal the 3D structure of a protein, and this information is central to basic biological understanding. The exploitation of this biological knowledge has major implications for improvements in health, agriculture, animal welfare and the environment. However, this essential information is generally not available from experiment. Biologists then require computational methods to predict it. The Sternberg group has developed a powerful and user-friendly resource for predicting the 3D structure of a protein from its sequence. The first version was 3D-PSSM and the more recent version is known as PHYRE. This is disseminated via a web server: a user pastes their protein sequence of interest into a box and the server returns details of the predicted 3D structure with atomic coordinates and additional information. This resource has proved highly popular with the community. There have been over 130,000 submissions and the current rate is 1,000 per week. There have been over 1,200 citations to the two main papers describing 3D-PSSM and PHYRE. This grant will provide support for us to maintain, support and develop the PHYRE web server. The grant will support the following topics:
1) Recent developments that lead to a significant improvement in performance have not yet been incorporated into the software available to the community. We will develop an appropriate web interface to the new version, founded on the successful current design principles.
2) The program requires updates of the databases used in the prediction. At present the procedure is managed manually and is computationally time-consuming. We will automate and improve the procedure.
3) A number of other computational tools have been developed by groups around the world to predict structural and functional characteristics of proteins that complement PHYRE. These will be integrated into the server to provide a hub of information about a protein of interest.
4) End-user biologists need to know when to trust predictions. Hence we will augment the existing measures of confidence based on protein sequence information with cutting-edge tools that estimate prediction quality based on 3D information. This will permit the biologist to ascertain which regions of a protein model are trustworthy, and which are not, for use in subsequent theoretical or wet-lab work.
5) Visualisation is key when dealing with complex three-dimensional protein structures. Hence we will substantially extend the user's ability to plot a variety of predicted features mapped onto 3D model predictions.
6) We will provide e-mail user support together with extensive documentation. In addition, we will run three hands-on workshops for biologists interested in using the methodology.
The work will be disseminated by publications in the scientific literature and presentations at national and international meetings.
Committee Closed Committee - Engineering & Biological Systems (EBS)
Research Topics Structural Biology, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme X – not Funded via a specific Funding Scheme