Award details

The Jalview Resource for Sequence Analysis and Annotation

Reference: BB/L020742/1
Principal Investigator / Supervisor: Professor Geoffrey Barton
Co-Investigators / Co-Supervisors: Dr James Procter
Institution: University of Dundee
Department: School of Life Sciences
Funding type: Research
Value (£): 513,686
Status: Completed
Type: Research Grant
Start date: 01/10/2014
End date: 30/09/2019
Duration: 60 months

Abstract

The analysis of sequences is carried out by all scientists interested in the functions of genes and proteins, and relies on techniques for sequence alignment. Working with alignments requires powerful, but easy-to-use interactive tools. The Jalview Resource is based on the Jalview open-source, GPL-licensed multiple sequence alignment editor and analysis workbench. Jalview (2.8) is installed on over 55,000 computers world-wide, and according to Google, is mentioned on hundreds of thousands of web pages. In applet form, Jalview is exploited by major databases such as Pfam and the EBI services. In addition to sophisticated multiple alignment editing functions for DNA, RNA and protein sequences, which include multiple levels of "undo", multiple "views" and the ability to "hide" sequences and columns of an alignment, Jalview provides linked views of trees, DNA and protein sequences, and protein three-dimensional structures. It communicates with web services via SOAP or REST for CPU-intensive analyses on remote servers, with six standard multiple alignment algorithms, four disorder predictors and the JPred secondary structure prediction server currently available at Dundee. Jalview also reads and displays annotations from more than 1,400 DAS servers available world-wide, as well as locally installed servers. Thus, Jalview is a net-centred application that puts distributed data and compute resources at the fingertips of its users. Our aim is to provide core, co-ordinating support for Jalview and its growing user community. Continued funding will help us to encourage and integrate voluntary code contributions from the Jalview community, and further develop the resource through user and developer workshops and on-line documentation. Support from BBSRC will ensure the existing BBSRC investment in Jalview is maintained and that Jalview development can continue to respond to the needs of its significant user base in the biological research community.

Summary

Comparison is a central part of everyone's daily life. If two objects look similar, then a first guess is that they might have similar functions. Comparison of the biological molecules DNA, RNA and proteins from different individuals or species is central to understanding what each molecule does. DNA, RNA and protein molecules can be represented as long "words" of a few letters up to thousands or even millions of letters in length. It is these words, or "sequences", that are compared by computer programs. For example, comparison of a protein sequence from a plant disease-causing bacterium with that from the plant may suggest ways to design a treatment to kill the bacterium and not the plant. Ideally, more than two sequences are compared at the same time, as this makes it easier to spot common features. This process is called "multiple sequence alignment" and is one of the most powerful techniques available in modern biology. Easy-to-use but feature-rich interactive computer programs are essential to visualise multiple sequence alignments and to extract the most information from them. Jalview is the most widely used program for visualising and editing sequence alignments. It is installed on over 55,000 computers world-wide, is mentioned on hundreds of thousands of web pages, and is in daily use by thousands of scientists and students. Jalview not only allows alignments to be viewed, coloured and edited, but also exploits the internet to connect to over 1,400 sources of sequences and annotations on those sequences. In addition, Jalview exploits state-of-the-art internet technology to connect to computer servers that can perform large calculations on the sequences and return the results to the user in a few seconds or minutes. In this way the program puts a mass of human knowledge at the fingertips of scientists and so saves them a huge amount of time in exploring the function of their favourite protein, DNA or RNA molecule.
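As a rough illustration of why comparing several sequences at once makes common features easier to spot, the sketch below takes a small alignment and reports, for each column, the most common letter and the fraction of sequences that share it. The sequences and the function name `column_consensus` are invented for this example; this is not Jalview's own code or its conservation measure, only a minimal sketch of the idea.

```python
from collections import Counter

# Hypothetical toy alignment: rows have equal length and '-' marks a gap.
alignment = [
    "MKT-AYIAK",
    "MKTQAYLAK",
    "MRTQAYIAR",
    "MKTQSYIAK",
]


def column_consensus(rows):
    """Return (consensus string, per-column fraction of rows that match it)."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    conservation = []
    # zip(*rows) walks the alignment one column at a time.
    for column in zip(*rows):
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue)
        conservation.append(count / len(rows))
    return "".join(consensus), conservation


consensus, conservation = column_consensus(alignment)
print(consensus)
for position, (residue, fraction) in enumerate(zip(consensus, conservation), start=1):
    print(position, residue, f"{fraction:.2f}")
```

Columns where the fraction is 1.00 are identical in every sequence, which is the kind of shared feature that is hard to see when only two sequences are compared.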
Some databases of alignments contain alignments with over 200,000 sequences. New technology in DNA sequencing is enabling sequences to be determined faster than ever, so the total volume of sequence data available is doubling approximately once a year. However, sophisticated computer programs like Jalview require maintenance to fix bugs and to introduce new features as new techniques and data sources are developed world-wide. In order to ensure that Jalview can meet the needs of its large community in the face of the data onslaught, we are requesting support for staff to maintain and develop the Jalview Resource and train people how to use it. The principal aims of the resource will be:
1. To drive the maintenance and further development of the Jalview sequence analysis resource from October 2014 onwards.
2. To provide a range of training resources and regular workshops both for end-users of Jalview and for software developers who wish to add new features or incorporate Jalview in their own systems.
3. To facilitate the open development of Jalview as a community project.
4. To develop Jalview to be able to view, calculate properties of, and edit alignments of any size.
5. To migrate the Jalview applet that works on a web page to JavaScript.
6. To extend Jalview to provide flexible access to sophisticated state-of-the-art tools for sequence search and alignment in collaboration with their developers.
7. To extend Jalview to support state-of-the-art tools for the calculation of evolutionary trees.
8. To continue to seek new funding sources to expand the Jalview development team and improve the long-term sustainability of the resource.
The applications of the Jalview Resource are manifold, from a teaching tool, through a figure-for-publication-making tool and expert analysts' workbench, to a neat way to display sequence alignments on a web page. The applications of Jalview span all areas of biology and so the benefits of the resource are legion.

Impact Summary

The Jalview Resource is widely used by the international biological sciences community. It has an impact on all areas of academic BBSRC research, as well as on research funded by the MRC, the Wellcome Trust and other research councils and charities that support work involving genome or protein sequences. Users of the Jalview Resource span academia across all biological subject areas and researchers in the pharmaceutical, agrochemical, agricultural and animal breeding industries where the analysis of protein sequences and their functional context is important to the economic success of the company. The wider impact of the Jalview Resource beyond BBSRC research has been highlighted by the recent award of a Biomedical Resources Grant from the Wellcome Trust. As such, the Jalview Resource will continue to have both economic and societal impacts by increasing the speed, accuracy and depth of inference possible from sequence data and so increasing the competitiveness of its users in academia and industry. Improved competitiveness of the users of the resource across such a wide range of academic and industrial domains is likely to lead to improved competitiveness for the UK. The Jalview Resource is also widely used by educators to teach students in life sciences disciplines both basic and advanced sequence analysis. This educational role will enhance the knowledge and expertise of future generations of biologists and technologists working in academia and industry across all molecular life sciences disciplines in the UK. Further beneficiaries will be attendees at the annual training workshop that will be run to teach potential users both the scientific background to the methods exploited by the Jalview Resource and the practical use of Jalview on their specific problems. The training workshops will be open to graduate students, postdocs, academics and members of industry.
For those who can't attend the workshops, the on-line e-learning materials will provide similar information backed by informal email support. New e-learning materials developed in this proposal will enable the Jalview Resource to reach out to scientists of the future through schools and a wider audience of interested individuals in the general public. We will establish teaching materials and guides aimed at project-level work for students at A-level or equivalent (e.g. Scottish Higher/Advanced Higher) and introduce local school teachers to the Resource through workshops. The Jalview Resource is primarily a scientific research tool aimed at accelerating scientific discovery and maximising the benefit of investment in sequence data generation. However, it can also be explained to the general public through public displays and open days. We have experience of public outreach through the annual "Doors Open Day" at Dundee and through the development of our GenomeScroller exhibit that provides an exciting backdrop on which to explain the human genome, how big it is and how much (and how little) is understood about how it functions. In Year 2 of this grant and subsequent years, we will aim to display and explain outputs of the Jalview Resource alongside GenomeScroller in order to introduce a new audience to the power and excitement of bioinformatics research.
Committee: Research Committee C (Genes, development and STEM approaches to biology)
Research Topics: X – not assigned to a current Research Topic
Research Priority: X – Research Priority information not available
Research Initiative: Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme: X – not Funded via a specific Funding Scheme