Award details

WormBase-ParaSite

Reference BB/K020080/1
Principal Investigator / Supervisor Dr Paul Kersey
Co-Investigators / Co-Supervisors Dr Kevin Howe
Institution EMBL - European Bioinformatics Institute
Department Ensembl Genomes
Funding type Research
Value (£) 291,035
Status Completed
Type Research Grant
Start date 03/03/2014
End date 02/03/2017
Duration 36 months

Abstract

WormBase-ParaSite will be a new database and set of user interfaces focused on parasitic helminths, i.e. roundworms and flatworms. Large numbers of helminth genomes are currently being sequenced, and the new resource will solve the problem of badly organised and inconsistently annotated genomes which, if unaddressed, will make these valuable data hard for researchers to use effectively. Over a period of 3 years, we propose to structurally and functionally annotate at least 200 genomes using well-documented, state-of-the-art approaches (e.g. Augustus/Maker, RFAM, InterPro), and to perform comparative analyses based on selective pairwise and multiple DNA alignment, HMM clustering of protein sequences, and evolutionary analysis of protein families. We will develop new data mining tools (based on established data warehousing infrastructure such as BioMart or InterMine) to allow users to efficiently extract information from the database, offering queries relating to, e.g., genetic variation, gene content and taxonomic distribution. The user interface will be based on the Ensembl platform and will allow users to compare reference and other data stored in the resource with user-generated data stored in standard file formats (e.g. BAM for alignments, VCF for variants) via easy-to-use upload/visualisation tools. Apart from this interactive interface, the proposed resource will also allow programmatic access to the data. Frequent data releases will ensure the prompt availability of data and analysis results to the community. To allow ongoing improvement of the annotation, we will deploy a community curation tool. We will also work closely with WormBase, the database for the model nematode C. elegans, which is expected to provide an eventual home for some of the genomes of highest interest. Finally, we will run an open workshop aimed at training scientists in efficiently using the database resource and its visualisation and analysis tools.

Summary

Flatworms and roundworms are diverse groups of organisms and include those responsible for serious human, veterinary and plant diseases. Their global impact is hard to measure, but annual human morbidity is estimated to be equivalent to at least 50 million productive years of life (cf. 85M for HIV/AIDS), and agricultural losses from plant parasitic nematodes can be measured in hundreds of millions of dollars. Despite their impact on human health, a recent report highlighted that parasitic helminths attract only $77M per annum in research expenditure (cf. $1.1 billion for HIV/AIDS). Due to the wide range of pathology caused in various host species, no single model species can capture the range of disease-causing mechanisms involved. Parasitologists are therefore inherently interested in making comparisons between many different species. Access to genomic-scale datasets has revolutionized molecular and cell biological studies of protozoan pathogens, advancing basic and applied research, but this is only now starting to happen for parasitic worms. Major sequencing programmes are now underway and large-scale functional genomics datasets are beginning to emerge (e.g. RNA-Seq is becoming commonplace). Studies of genomic variation in multiple isolates, to address clinical, epidemiological or applied agricultural questions, are the obvious next steps. While the emergence of new genome-scale datasets is immensely exciting, genomes are often produced in a relatively poor state of assembly and annotation compared to existing reference genomes. Moreover, the prevailing paradigm for organising genomic information (essentially, the genome browser and the underlying data models that support it) is relatively poorly fitted to the exploration of hundreds of highly fragmented genomes with limited functional characterisation.
In this application, we propose the creation of a new resource to organise, classify and allow the exploration of hundreds of worm genomes, facilitating the exploitation of sequence-based data for understanding and ultimately controlling worm-induced pathology. The proposed resource will be called WormBase-ParaSite, and will be strategically aligned with WormBase (the main resource for the model nematode Caenorhabditis elegans). Specifically, the resource will provide: (a) Gene structures and functional annotation for unannotated worm genomes. (b) Comparative genomic analysis, visualisation and querying. (c) Methods for exploring and mining the complete data set, through an intuitive query-building interface accessible to research scientists. (d) A platform for visualising the results of high-throughput sequencing experiments in the context of other annotation, enabling functional genomics and variation studies. (e) An infrastructure for accepting and integrating functional annotation submitted by the community engaged in worm research. The project will complement the existing scope (and leverage the existing content) of WormBase, which has supplied biologists working on the model worm Caenorhabditis elegans with an invaluable information resource since the genome of this species was one of the first to be deciphered. It will provide additional capacity and tools for handling a massively increased quantity of genomes, with a clear focus on the information that is most relevant to parasitologists. The resource will also provide a home for data from the flatworms, such as flukes and tapeworms, which are outside the scope of WormBase.

Impact Summary

Parasitic helminths are studied with the aim of killing or controlling them. The proposed resource will significantly facilitate the exploitation and application of sequence-based data towards this aim. For pathogens with smaller and simpler genomes, such as viruses and bacteria, genomic insights are already being translated into tangible benefits for medicine, such as the ability to track pathogen transmission and to monitor drug resistance. Yet it is clear that the application of genomic science towards medical, veterinary and agricultural improvements is still in its infancy, and that many more and bigger benefits will be realized in the long term. By analogy, sequencing-based research is expected to deliver significant benefits in the fight against the diverse helminthiases afflicting humans, animals, and plants. Downstream beneficiaries will first and foremost be people directly suffering from helminth infections, who include about 2 billion people infected with soil-transmitted helminths alone (WHO 2012). Advances in drug treatment, transmission reduction or vaccination could improve the lives of many people who may otherwise suffer from serious gastrointestinal disease, stunted growth and mental development, malnutrition and fatigue, disfigurement, blindness, or liver and bladder pathologies. Although some effective anthelminthics exist, the available arsenal of drugs is limited, which makes the development and spread of drug resistance a real danger, especially as mass drug administration is a predominant tool for helminthiasis control in developing countries. Furthermore, large-scale improvements in the treatment and control of helminthiases are likely to bring huge socio-economic benefits to some of the least developed countries. In addition to the direct health improvements from a reduction in helminth infections, people in endemic areas could also benefit indirectly, e.g. by an improved response to vaccinations, by reduced transmission or by an improved disease outcome for other diseases such as tuberculosis, malaria, and HIV/AIDS, as co-infections with helminths have been shown to have potentially adverse effects (see Elliott and Yazdanbakhsh, 2012, and other articles in the same issue of this journal for recent reviews).
In the UK, one major impact of helminth-related diseases is on agricultural production, especially in the potato and in the sheep and goat farming industries. Improved interventions and control measures against helminths such as Globodera, Teladorsagia, and Haemonchus spp. could therefore greatly benefit UK farmers and related agricultural and pharmaceutical industries. Globally, species of the genus Heterodera are significant nematode pests of various agricultural plants including cereals and soybean. The proposed resource will therefore fit squarely within the BBSRC's strategic research priority of 'Food security' and contribute to the areas of 'Crop science', 'Animal health', and 'Livestock production'. In addition, novel or improved helminth interventions could benefit pet owners and their companion animals and could help reduce the environmental impact of the large-scale application of nematicides for crop production. Conversely, the proposed resource could ultimately also contribute to an improved use of beneficial nematodes, e.g. entomopathogenic nematodes to fight pine weevils.
Looking further into the future, a thorough understanding of helminths and their interactions with the human immune system may lead to fundamental new insights that will allow a much more sophisticated manipulation of the human immune system for medical purposes. Similarly, scientists are just beginning to uncover the complex interactions between the gut microflora (i.e. bacteria), the macrofauna (i.e. helminths), and human immunity. Together, such knowledge may ultimately be exploited for the effective treatment of allergies and other (autoimmune) diseases.
Committee Not funded via Committee
Research Topics Soil Science
Research Priority X – Research Priority information not available
Research Initiative Bioinformatics and Biological Resources Fund (BBR) [2007-2015]
Funding Scheme X – not Funded via a specific Funding Scheme