Award details

Homomorphic Encryption of Genotypes and Phenotypes for Quantitative Genetics

Reference BB/V00767X/1
Principal Investigator / Supervisor Professor Richard Mott
Co-Investigators / Co-Supervisors
Institution University College London
Department UCL Genetics Institute
Funding type Research
Value (£) 403,034
Status Current
Type Research Grant
Start date 01/03/2022
End date 28/11/2025
Duration 45 months

Abstract

Quantitative genetic analysis - such as calculating heritability, testing genetic association, and using mixed linear models to control for unequal relatedness between individuals - is a cornerstone of several important areas of genetics, including human complex disease mapping and animal and crop improvement. To make progress it is often necessary to share data between studies, but privacy concerns sometimes prevent or delay data sharing. We previously developed a method that uses random orthogonal matrix keys to encrypt genotype and phenotype plaintext into ciphertext that closely resembles samples of Gaussian deviates. Orthogonal transformation leaves unchanged key parts of the quantitative genetic machinery, including the likelihood, parameters, heritability and the effects of a mixed model transformation. However, it scrambles the identities of individuals by replacing individual genotypes with random linear superpositions. We propose to develop the use of random orthogonal matrix keys into a fully-fledged methodology and software package that can be used routinely by genetics researchers to share and analyse genetic data. We will also extend the methodology to other datatypes such as transcriptomic data, provided the analysis fits within a mixed model framework with Normal errors. We will aim to identify and correct any weaknesses that might permit decryption, and to work with potential users of the system in human, plant and animal genetics, to propagate its use and thereby accelerate the sharing of genetic data and the use of the FAIR (Findable, Accessible, Interoperable and Reusable) principles.
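The invariance described above can be illustrated with a minimal sketch. It is not the project's software: the function names (random_orthogonal_key, encrypt, slope) are hypothetical, the key is built by Gram-Schmidt orthonormalisation purely for illustration, and the model is a toy single-marker regression without an intercept (with an intercept, the column of ones would have to be transformed by the same key). Under those assumptions, multiplying genotypes and phenotypes by the same orthogonal key preserves inner products, so the least-squares effect estimate is the same on plaintext and ciphertext.

```python
import math
import random


def random_orthogonal_key(n, seed=None):
    """Return an n x n orthogonal matrix (list of rows), built by
    Gram-Schmidt orthonormalisation of random Gaussian vectors.
    Illustrative only; not the key-generation scheme of the project."""
    rng = random.Random(seed)
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the rows accepted so far.
        for q in rows:
            proj = sum(a * b for a, b in zip(v, q))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # skip a (very unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows


def encrypt(key, x):
    """Apply the key to a length-n vector, i.e. compute key @ x."""
    return [sum(k * xi for k, xi in zip(row, x)) for row in key]


def slope(g, y):
    """Least-squares slope of y on g with no intercept: (g.y) / (g.g)."""
    return sum(a * b for a, b in zip(g, y)) / sum(a * a for a in g)


# Toy data: genotype dosages (0/1/2) and a phenotype with a true effect of 0.5.
rng = random.Random(1)
n = 40
g = [float(rng.choice([0, 1, 2])) for _ in range(n)]
y = [0.5 * gi + rng.gauss(0.0, 1.0) for gi in g]

# Encrypt genotype and phenotype with the same key.
key = random_orthogonal_key(n, seed=2)
g_enc, y_enc = encrypt(key, g), encrypt(key, y)

# The effect estimate agrees up to floating-point rounding, although each
# encrypted value is a random linear superposition over all individuals.
print(abs(slope(g, y) - slope(g_enc, y_enc)) < 1e-9)
```

Each entry of g_enc mixes the genotypes of all 40 toy individuals, so no entry corresponds to a single person, yet the regression gives the same answer. Gram-Schmidt scales cubically with the number of individuals, so a tool intended for very large datasets would presumably need a different construction.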

Summary

In order to identify genes that are associated with important traits like disease in humans, or improved yield in crops, it is necessary to analyse very large samples of individuals. Often this involves sharing genetic and other data collected in different studies, and there are risks to individuals' genetic privacy if these data are shared as plaintext. Homomorphic encryption refers to a type of data encryption that obscures the original plaintext data by replacing it with a ciphertext which nonetheless contains sufficient structure that it is still possible to perform the same data analyses as with the plaintext, thereby increasing the power to make discoveries whilst maintaining genetic privacy. We have previously developed a method for homomorphic encryption of genotype and phenotype data, based on random high-dimensional rotations of data. In this proposal we will develop our method into a practical tool that can be used by geneticists and other scientists. This will involve writing a software implementation that can operate on very large datasets, and working closely with stakeholders to ensure the code is as useful as possible.
Committee Research Committee C (Genes, development and STEM approaches to biology)
Research Topics Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative X - not in an Initiative
Funding Scheme X – not Funded via a specific Funding Scheme