BBSRC Portfolio Analyser
Award details
UCL Biosciences Big Data
Reference
BB/R01356X/1
Principal Investigator / Supervisor
Professor Richard Mott
Co-Investigators / Co-Supervisors
Institution
University College London
Department
Genetics Evolution and Environment
Funding type
Research
Value (£)
563,333
Status
Completed
Type
Research Grant
Start date
01/05/2018
End date
31/05/2019
Duration
13 months
Abstract
The proposal creates a new infrastructure for scientific computation within the UCL Division of Biosciences, to enable the growth of Big Data analysis over the next four years. This supports the work of Biosciences research groups in genetics, bioinformatics, protein structural studies, molecular biology, developmental biology, ecology, evolution, biophysics and biophotonics. Much of this research portfolio is funded by the BBSRC. The infrastructure comprises a fast 10 Gb/s network connecting imaging equipment in different buildings to a centralised data centre. The data centre will comprise 3 PB of RAID5 storage attached to at least 700 compute cores, each with 8 GB RAM and arranged in groups of 24 cores, together with a 48-core server with 1 TB of shared RAM for parallel computations that need large shared memory, such as some image processing and de novo assembly of genomes. This integrated suite of equipment will support the foreseeable High-Performance Computing needs of UCL Biosciences for the next four years.
Summary
Much research in the biosciences is transitioning away from experimental work towards computational analysis. This involves a massive increase in the volume of data to be analysed, in the computational power needed to analyse it, and in the speed and capacity of the computer networks that move data from the collection instruments to the compute clusters used to analyse it and then on to data archives. University College London is one of the leading research-focused universities in the world. The UCL Division of Biosciences undertakes an increasing amount of research involving Big Data. This includes large-scale biological and cellular imaging and DNA sequencing, much of it funded by the BBSRC and other UK research organisations. The proposal aims to put in place a new computational infrastructure within UCL Biosciences that links its data generation groups together with augmented compute and archive facilities. This involves installing a high-speed network capable of transferring up to 10 Gb/s from data collection instruments to the compute clusters and data archives managed in the UCL Departments of Computer Science and Research Computing. It also involves purchasing over 700 compute cores for data analysis and over two petabytes of disk storage, in a combination of high-speed disk attached to the compute cluster and lower-speed, cheaper disk for archiving. This combination of network, compute and storage will position UCL Biosciences to grow as a UK centre of research excellence over the next four years.
Impact Summary
Who might benefit from this equipment?
Researchers in UCL Biosciences funded by BBSRC will see an immediate benefit. These include all researchers using High Performance Computing, particularly in the fields of genetics, bioinformatics, protein structural studies, molecular biology, developmental biology, ecology, evolution, biophysics and biophotonics. Over 20 research groups in UCL Biosciences are likely to be immediate beneficiaries. Collaborators and other UCL scientists will also benefit. In the longer term, the BBSRC and the wider community will benefit from the accelerated rate of research that will be produced. Collaborators working on crop improvement in Low- and Middle-Income Countries (LMIC) will benefit.
How might they benefit from this equipment?
(i) Researchers using the BBSRC-funded imaging facilities in UCL Biosciences will be able to transfer their images over the new 10 Gb/s network to be implemented as part of the equipment grant, and store them on the >2 PB of storage. They will then be able to process their images using the >700 compute cores and high-memory server.
(ii) Researchers using population DNA sequence data will be able to perform analyses such as mapping reads from multiple individuals to the reference genome, and creating de novo assemblies of large genomes. They will be able to analyse very large datasets (~50 TB), which at present is very difficult due to space and compute resource limitations. These researchers include (a) Prof Richard Mott, who has BBSRC/GCRF research programmes into crop improvement for wheat, rice and chickpea involving LMIC collaborators, analysing thousands of crop genomes; and (b) Prof Chris Thompson, who works on Dictyostelium genetics and the evolution of social and non-social traits and will be able to analyse ~1000 genomes.
(iii) Researchers using the CATH protein database developed by Prof Christine Orengo will be able to perform large-scale searches of large metagenomes, providing functional characterisation of metagenomes across environments to detect novel bacterial relatives of CATH families associated with antibiotic resistance. Understanding antimicrobial resistance is a strategic priority for the BBSRC.
The Case for Support describes more research projects that will benefit directly from this equipment.
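As a rough illustration of what the figures above imply, the sketch below estimates how long a dataset of about 50 TB would take to cross a 10 Gb/s link. It is a back-of-envelope calculation only: it assumes decimal units (1 TB = 10^12 bytes, 1 Gb = 10^9 bits) and a link running at its full nominal rate with no protocol overhead or contention, so it gives a lower bound rather than a measured figure. The function name `transfer_hours` is hypothetical and does not come from the proposal.

```python
def transfer_hours(dataset_terabytes: float, link_gigabits_per_s: float) -> float:
    """Ideal (lower-bound) transfer time in hours.

    Assumptions, not taken from the proposal: decimal units and a link
    that sustains its nominal rate with no overhead.
    """
    bits = dataset_terabytes * 1e12 * 8            # terabytes -> bits
    seconds = bits / (link_gigabits_per_s * 1e9)   # bits / (bits per second)
    return seconds / 3600                          # seconds -> hours


if __name__ == "__main__":
    # ~50 TB dataset over the 10 Gb/s network described in the proposal
    print(f"{transfer_hours(50, 10):.1f} hours")   # prints "11.1 hours"
```

Under these assumptions a 50 TB dataset needs roughly 11 hours at best; real transfers would take longer.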
Committee
Not funded via Committee
Research Topics
X – not assigned to a current Research Topic
Research Priority
X – Research Priority information not available
Research Initiative
Advanced Life Sciences Research Technology Initiative (ALERT) [2013-2014]
Funding Scheme
X – not Funded via a specific Funding Scheme