Award details

Research Infrastructure

Reference BBS/E/F/000PR10352
Principal Investigator / Supervisor Dr Andrew Page
Co-Investigators / Co-Supervisors Dr Nicol Janecko, Dr Gemma Langridge, Professor Mark Pallen
Institution Quadram Institute Bioscience
Department Quadram Institute Bioscience Department
Funding type Research
Value (£) 1,082,170
Status Current
Type Institute Project
Start date 01/04/2018
End date 31/03/2023
Duration 47 months

Abstract

World-class, data-intensive bioscience requires world-class infrastructure and software. This theme focuses on the bioinformatics research and development needed to build the core software and informatics offerings that will underpin research within the Quadram Institute generally, and within the Microbes in the Food Chain ISP specifically. The project will design and provision a local cloud computing environment, with a launcher and software stack, to enable researchers within the Quadram Institute to use standardised bioinformatics pipelines (developed within this theme and by core bioinformatics activity at the Quadram) or to develop their own software. The cloud system will be defined using DevOps tools (Ansible and Salt), and these definitions will be shared externally so that others can build their own systems. Our local cloud system will be based on open source software, and the launcher will be built on cloud APIs to enable future system expansion and access to other NBI cloud resources. As part of this development we will aim to accredit our system to ISO 27001 and will deploy next generation firewalls locally to provide additional security; together these measures will enable the holding of sensitive data. A wide range of technologies is in use across the Quadram Institute for the generation of ‘omics-type data. All of these generate some form of digital data, which is often linked (e.g. all coming from the same sample, the same experiment, or the same project). Building on our cloud resource, we will develop a system to collate, integrate and catalogue this data, providing a core set of analysis tools to enable mining of that data. This system will enable data to be presented for more advanced methods such as AI/machine learning.
Because we will be collecting and collating experimental data within the Quadram Institute, there is an opportunity to present this data to external users, both to enhance the value of the data generated and to enable external users to access a range of data types linked to individual samples in one place. This will complement the upload of data to resources such as the European Nucleotide Archive, providing a ‘next generation culture collection’ that includes data beyond what is currently stored in, for example, an NCBI BioProject. The next generation culture collection will include the ability to order strains from the Quadram Institute, and will feature a set of tools for data export and data analysis using computational resources hosted within the NRP institutes. Lastly, to support research activities within the Quadram Institute, we will develop a simplified sequencing pipeline combining automated QC, de novo assembly, variant calling and isolate characterisation. The system will be designed to be deployable as a virtual machine (VM) and to support the attachment of sequencing instruments (including Illumina and Nanopore) to the VM, providing an end-to-end analysis solution for small labs as well as Quadram Institute researchers. We will develop the pipelines using Nextflow, and will seek to add to these tools as new software and technologies become available.

Summary

unavailable
Committee Not funded via Committee
Research Topics Microbiology, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative X - not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme