Award details

SynDNAStore. Synthetic biology innovation around the design of DNA molecules for digital archiving

Reference BB/L023741/1
Principal Investigator / Supervisor Dr Nick Goldman
Co-Investigators / Co-Supervisors Dr Ewan Birney
Institution EMBL - European Bioinformatics Institute
Department Goldman Group
Funding type Research
Value (£) 511,733
Status Completed
Type Research Grant
Start date 01/04/2015
End date 31/08/2018
Duration 41 months

Abstract

This is a high-risk synthetic biology proposal to further the aim of using DNA as a viable long-term digital storage medium. It builds upon previously published work. We have four specific aims: (1) to explore the copying behaviour of a number of different schemes (PCR-based, rolling circle and RepA-based schemes), quantifying both per-base error rates and (more importantly for this proposal) per-fragment replication rates; (2) to design information coding schemes for DNA which maximise the amount of information encoded in DNA whilst also being optimised for low error rate and even copying behaviour; (3) to develop a physical indexing scheme for Digital DNA storage, allowing separate compartments to be independently retrieved; and (4) to create a mid-scale (~Terabyte) level storage of culturally important digital works.

Summary

There is an increasing amount of information stored in digital media: films, TV programs, pictures, MP3s, PDFs and others. Indeed, a considerable amount of information is "born digital" in that it was generated at the outset in a digital medium (for example, by digital cameras). Despite the everyday use of digital storage methods for this information, it is far more complex to store digital information over a long period of time, i.e. over decades. Current best practice involves storing information on magnetic tape and regularly migrating to newer generations of the tape technology, every 5 to 10 years. DNA, the chemical which stores the information passed on between generations, is a naturally digital storage scheme; it has four possible "letters" (A, T, G and C). Because of the extensive research that has occurred over the last forty years, we can easily read, cut, copy and splice DNA together. More recently it has also become feasible to write DNA to a design of your choosing. Previously we showed that we could use DNA for a very different purpose than storing genetic information, namely to store any digital information. In our first trial we stored all of Shakespeare's sonnets, an MP3 extract of Martin Luther King's "I have a dream" speech, a JPG of a picture of the EBI building, and the PDF of the Watson and Crick publication in Nature in 1953. We can be confident that this information can survive for over 1,000 years. In this proposal we want to expand on this work to exploit the fact that one can copy DNA information accurately and cheaply. To do this we need to know with far better precision the engineering properties of DNA enzymes - do they work evenly on every sequence? What is their precise error rate? Using this information we will be able to design a Digital DNA store which can easily be replicated hundreds or thousands of times.
As a proof of concept we would also like to store a more substantial amount of culturally important information in DNA - for example, the entire works of Shakespeare, or the UN declaration of human rights. With this new work, and with the expected advances in synthesis, we expect to be able to store a substantial amount. To choose the precise works we store we will engage with the public, and we will also design an appropriate physical storage location.
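As a rough illustration of why the four-letter alphabet makes DNA "naturally digital", the sketch below maps each pair of bits to one base and back again. It is not the coding scheme used or proposed by this project: the bit-to-base assignment and the function names bytes_to_dna and dna_to_bytes are assumptions made purely for illustration, and the sketch includes none of the error-rate or copying-evenness optimisation that aim (2) of the Abstract calls for.

```python
# Illustrative only: a naive 2-bits-per-base mapping, NOT the project's scheme.
# The ordering of BASES (A=00, C=01, G=10, T=11) is an arbitrary assumption.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Decode a base string produced by bytes_to_dna back into bytes."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            # str.index raises ValueError for any character outside ACGT
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    message = b"To be, or not to be"
    encoded = bytes_to_dna(message)
    print(encoded)
    print(dna_to_bytes(encoded) == message)
```

A direct mapping like this packs two bits into every base but gives no protection against copying or reading errors, which is the gap the coding-scheme work described in the Abstract is meant to address.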

Impact Summary

This proposal has impacts on Academic, Economic and Societal levels. The Academic impacts include: (1) the development of a compelling piece of synthetic biology design, showcasing the blend between engineering and biological components; (2) the extensive exploration of the error rate and amplification rate of DNA copying enzymes at a level of precision suitable for engineering design (we will publish this work in appropriate journals and provide all the data backing up this information); and (3) the training of two individuals in synthetic biology, one from the experimental side and one from the theoretical/electrical engineering side. The Economic impact in the long term will be the provision of a robust, scalable, long-horizon digital archiving mechanism. This will allow both commercial and public sector organisations to archive information effectively and with low management costs over multiple decades (the theoretical horizon of this information is in millennia, but the commercially important component of this is in the 20-100 year time scale). To achieve this we will need to commercialise this technology, either via a technology-orientated small company or via a forward-thinking large company; in either case this will attract R&D investment into the UK. The Societal impact will be via a compelling piece of public engagement. We propose to archive a selection of culturally important digital works, and the technology truly has a multi-millennia horizon in terms of longevity. To choose the culturally important digital works we will engage with Arts and Humanities colleagues and the general public, potentially via high-profile TV, radio or festival components. This will contribute to the public understanding of science.
Committee Research Committee C (Genes, development and STEM approaches to biology)
Research Topics Synthetic Biology, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative X - not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme