Award details

COBrA-CT: An e-Science tool for ontology curation

Reference BB/D006473/1
Principal Investigator / Supervisor Professor Bonnie Webber
Co-Investigators / Co-Supervisors Dr Stuart Aitken, Professor Jonathan Bard
Institution University of Edinburgh
Department Sch of Informatics
Funding type Research
Value (£) 61,105
Status Completed
Type Research Grant
Start date 01/10/2005
End date 31/03/2007
Duration 18 months

Abstract

This project will develop tools and techniques for curation that will advance the state of the art in bio-ontologies. Currently, only the simplest version control is used for managing the development cycle of bio-ontologies, and no tool support is available for analysis and quality control; logic-based analysis can only be done on ontologies based on description logics (DL).
The COBrA-CT ontology tool will be an extended version of the COBrA bio-ontology editor. COBrA is an ontology editor that provides a dual view: the user explores an ontology through a tree-based visualisation, which shows the position of a concept in the hierarchy and is kept synchronised with a node-based visualisation. The node-based view is a summary of all facts about a concept, and it also provides a means to explore the ontology by hyperlinked navigation. Both visualisations are viewable at the same time when editing a single ontology. The user can also view two ontologies at the same time by selecting either the node-based or the tree-based visualisation of each ontology.
The motivation for the dual view was to support ontology mapping. This requirement arose in the XSPAN project, which needed to create links between analogous and homologous tissues in two different species through their OBO anatomy ontologies. In the work proposed here, the dual view will serve to visualise and annotate differences between two versions of the same ontology, a need that arises in curation. As these links (mappings) are first-class citizens, they can be exchanged and used in curation. This will allow COBrA-CT to link ontology edits to descriptions of why they are being proposed: curators will be presented with not only a graphical representation of the changes but also the rationale for them. Editing and linking anatomies are instances of the more general problems of managing and linking (mapping) bio-ontologies.
Thus, in the proposed project, we shall extend COBrA with new analysis functions and support a model of distributed curation by implementing an ontology management server with which COBrA users (be they curators or biologists in the field) can communicate. This communication will use Web service and GRID interfaces. The technical contributions are:
- A tool that supports enhanced ontology editing and critiquing. No currently available tool tailored for the bioinformatics community supports ontology development beyond providing a tree-based visualisation. COBrA-CT will provide conceptual and logical analysis functions that will enable users to critique the changes they themselves make to the ontology. The rationale for ontology edits will be captured in meta-data (ontology mappings).
- A distributed curation model, supported by a central ontology management system. COBrA users will be able to make both local quality checks and checks against the centrally held ontologies. We believe this will increase the acceptance rate of the changes submitted to the curators by biologists in the field. Curation will follow an explicit process model in order to manage the task. Visualisations of two versions of an ontology will use the dual-view feature and the meta-data.
- Ontology management based on XML keys. XML keys have been proposed for version control of scientific data, and we shall apply this method to ontologies. This technique is relevant because we shall ensure that all ontologies are represented in the Web Ontology Language (OWL), which has an XML syntax. Where possible, we shall also exploit the logic in which the ontology is encoded to make semantic checks. We note, however, that many bio-ontologies make no use of any formal, explicit logic, and so we are left with structured data, which we shall manage as such.
- Empirical usability trials, which we will conduct in order to validate the design and the claims of improved acceptance.
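As an illustration only of the key-based versioning idea in the third contribution, the following Python sketch compares two versions of a small XML document by treating one attribute as the key of each term. It is not the project's implementation. The element names (`term`, `name`, `parent`), the `id` key attribute, the sample data, and the functions `index_by_key` and `diff_versions` are all hypothetical, and real OWL serialisations are considerably richer than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified term records (not real OWL or OBO content).
OLD = """<ontology>
  <term id="T1"><name>heart</name><parent>T0</parent></term>
  <term id="T2"><name>left ventricle</name><parent>T1</parent></term>
  <term id="T3"><name>atrium</name><parent>T1</parent></term>
</ontology>"""

NEW = """<ontology>
  <term id="T1"><name>heart</name><parent>T0</parent></term>
  <term id="T2"><name>left cardiac ventricle</name><parent>T1</parent></term>
  <term id="T4"><name>aorta</name><parent>T1</parent></term>
</ontology>"""


def index_by_key(xml_text, key_attr="id"):
    """Map each term's key value to a dict of its child-element text."""
    root = ET.fromstring(xml_text)
    index = {}
    for term in root.iter("term"):
        key = term.get(key_attr)
        if key in index:
            # A key must identify exactly one element within a version.
            raise ValueError(f"duplicate key: {key}")
        index[key] = {child.tag: (child.text or "").strip() for child in term}
    return index


def diff_versions(old_xml, new_xml):
    """Report which keyed terms were added, removed, or changed."""
    old, new = index_by_key(old_xml), index_by_key(new_xml)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"added": added, "removed": removed, "changed": changed}


result = diff_versions(OLD, NEW)
print(result)
# {'added': ['T4'], 'removed': ['T3'], 'changed': ['T2']}
```

Because elements are matched by key rather than by position in the file, reordering terms between versions would not be reported as a change in this sketch; only additions, removals, and edits to a keyed term are.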

Summary

The long-term storage and management of scientific data is of great importance, not only to provide a permanent record of valuable information but also to provide proper methods for coping with updates to data sets. Without some form of version control, it is impossible to repeat analyses performed with an earlier version of the data, or to know why a particular item of data has been changed. In the life sciences, generating and publishing data has been of prime importance. Along with these data comes the associated annotation, which may identify the cellular location at which a gene was found to be expressed or the function it is considered to be performing. Ontologies such as the Gene Ontology and the anatomical ontologies for the model species are important resources for annotation. They play the role of a shared vocabulary for assigning meaningful descriptions to data. Like the data, ontologies are subject to change as our understanding grows and evolves. Because of this, ontologies must be curated and archived. The creation, editing, and publication of an ontology will be followed by many cycles of revision. What about the data that are annotated with respect to an ontology that has subsequently been revised? Errors and erroneous conclusions can follow if different data in the same database are annotated using different versions of an ontology; at best, data may be annotated with terms that have been rendered obsolete, so that no conclusion can be drawn. In an ideal world, changes to an ontology would propagate with respect to its terms. What can be done as a step towards this ideal? The aim of the proposed project is to build an ontology tool that can be used by biologists and ontology curators, such that additions and changes to the ontologies are managed as first-class citizens at a central point. The curators will have the right to accept changes proposed by others and to publish new versions of the ontology.
All users will access the ontology through the same tool, COBrA-CT, which we propose to create by extending the existing COBrA ontology editor. To achieve compatibility with other e-Science initiatives, we shall adopt an existing framework used in the Protege tool (a widely-used ontology editor developed at Stanford University). We shall cooperate with the Database Group in Informatics to apply their data-archiving techniques to ontologies.
Committee Closed Committee - Engineering & Biological Systems (EBS)
Research Topics X – not assigned to a current Research Topic
Research Priority X – Research Priority information not available
Research Initiative EDF (e-science Development Fund) (EDF) [2003-2005]
Funding Scheme X – not Funded via a specific Funding Scheme