Award details

Using Solexa/Illumina methods to investigate plant pathogen variation and transcriptome

Reference BB/F016190/1
Principal Investigator / Supervisor Professor Jonathan Jones
Co-Investigators / Co-Supervisors
Institution University of East Anglia
Department Sainsbury Laboratory
Funding type Research
Value (£) 105,942
Status Completed
Type Research Grant
Start date 13/03/2008
End date 12/06/2009
Duration 15 months

Abstract

Using a massively parallel sequencing approach, the Illumina Genome Analyzer (www.illumina.com) can generate more than one billion base pairs in a single run. Two runs will be enough to re-sequence and re-assemble one strain of Hp, with the assembly guided by the available Hp Emoy2 reference genome. Our new SAGE protocol, based on the SMART cDNA method, aims to enrich for tags at the 5' UTR of the mRNA, rather than the tags generated from random DpnII or NlaIII sites in the current protocol. To sequence the whole transcriptome starting from total mRNA of an infected leaf, a normalization step is necessary to maximize representation of less abundant genes. Hp genes expressed in planta will be enriched by a cDNA selection method established in the lab (Rougon and Jones, unpublished). To achieve random fragmentation, short cDNAs need to be concatamerised before nebulisation. Concatamerisation will be achieved by modifying the 'Creator SMART cDNA' kit so that 5' ends of the cDNA can be ligated at high concentration to 3' ends. Assembly will be based on the recently developed Short Sequence Assembly by progressive K-mer search and 3' read Extension (SSAKE v1.1) program, which was successfully tested for de novo sequence assembly (Warren et al. 2006 Bioinformatics 23: 500-501). SSAKE version 1.1 and related programs are particularly adapted to Solexa sequencing and address the concern that Solexa reads show higher error rates towards the end of the sequence, as signal intensity decreases during a sequencing run. Other short-read assembly methods developed in the Birney lab (e.g. 'Velvet') are currently being established and are more tolerant of higher error rates. One run should enable 60-fold deep sequencing of most genes, which, based on our experience with bacterial genome sequencing, should be sufficient to generate contigs of >1000 bp by de novo assembly.
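
As an illustration of the idea behind assembly by progressive k-mer search and 3' read extension, the following is a minimal Python sketch. It is not SSAKE itself and does not reproduce its data structures, consensus rules or parameters; the function name extend_3prime and the min_overlap argument are hypothetical, and the sketch assumes error-free reads from a single strand. It stops at the first ambiguous extension, whereas a real assembler would also have to handle sequencing errors.

```python
def extend_3prime(seed, reads, min_overlap=4):
    """Greedily extend `seed` at its 3' end using overlapping reads.

    Toy illustration only (hypothetical helper, not the SSAKE program):
    assumes error-free, same-strand reads and stops when an extension
    is ambiguous.
    """
    contig = seed
    unused = set(reads)
    unused.discard(seed)
    longest_read = max((len(r) for r in reads), default=0)

    extended = True
    while extended:
        extended = False
        # Progressive search: try the longest suffix of the contig first,
        # then shorter ones, down to the minimum allowed overlap.
        start_k = min(len(contig), longest_read - 1)
        for k in range(start_k, min_overlap - 1, -1):
            suffix = contig[-k:]
            candidates = [r for r in unused
                          if len(r) > k and r.startswith(suffix)]
            if len(candidates) == 1:
                read = candidates[0]
                contig += read[k:]      # append the non-overlapping 3' part
                unused.remove(read)
                extended = True
                break
            if len(candidates) > 1:
                # More than one read could extend the contig: stop here.
                return contig, unused
    return contig, unused
```

Called with a seed read and a set of overlapping reads, the function returns the extended contig together with the reads it did not use.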

Summary

Using the Illumina Genome Analyzer (www.illumina.com), five races of Hp in addition to Emoy2 (Noco2, Maks9, Cala2, Waco9, Hind2) will be re-sequenced with the objective of identifying genes that show diversifying selection. To identify expression patterns within these genomic regions, a serial analysis of gene expression (SAGE)-based mRNA profiling method (Velculescu et al. 1995 Science 270: 484-487) will be established using Solexa sequencing. This novel method will be based on the SMART cDNA protocol (www.clontech.com) to obtain reads from the 5' end of transcripts. This will reveal where transcripts start, and can also be used for semi-quantitative analysis of expression levels and to show when different genes are expressed during infection. To identify new open reading frames and verify predicted ones, a method will be established to sequence the transcriptome of the pathogen growing in planta. Transcriptome analysis of an obligate biotrophic pathogen will take advantage of a method newly developed in the Jones lab for enriching genes expressed by pathogens in plants (Rougon and Jones, unpublished), which will be combined with a cDNA normalization technique. The Solexa sequencing approach relies on attachment of randomly fragmented (nebulised) DNA to a flow cell. Since short cDNAs do not fragment randomly, a method will be established that allows cDNA concatamerisation prior to random fragmentation. We would like to apply this cDNA method to pathogens whose genomes are not yet sequenced. Different methods currently available for assembling short reads, which have proved useful with bacterial DNA, will be tested and adapted for de novo assembly of cDNA. A computational method will be developed to combine all data into one database that allows easy access to information about variation of genome sequences between races, expression levels, gene structure and possible functions. All data will be made publicly available.
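
The semi-quantitative use of 5' tags mentioned above can be illustrated with a short Python sketch that counts tags per gene and scales the counts to tags per million. This is an assumption about how such counts might be summarised, not the project's actual analysis; the function name tags_per_million and the tag-to-gene lookup table are hypothetical.

```python
from collections import Counter


def tags_per_million(tag_to_gene, tags):
    """Count sequenced tags per gene and scale to tags per million.

    `tag_to_gene` is a hypothetical lookup from tag sequence to gene
    identifier; tags absent from the lookup are ignored.
    """
    counts = Counter(tag_to_gene[t] for t in tags if t in tag_to_gene)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Scale each gene's count relative to all mapped tags.
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}
```

Scaling to a common total lets samples with different numbers of sequenced tags, for example from different stages of infection, be compared.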
Committee Closed Committee - Plant & Microbial Sciences (PMS)
Research Topics Crop Science, Microbiology, Plant Science, Technology and Methods Development
Research Priority X – Research Priority information not available
Research Initiative Tools and Resources Development Fund (TRDF) [2006-2015]
Funding Scheme X – not Funded via a specific Funding Scheme