High Throughput SNP Genotyping:
The Illumina Golden Gate Assay
Principle of Operation:
High throughput genotyping in the DNA technologies core uses the bead array hybridization system from Illumina. In this procedure, hundreds of allele specific and locus specific oligonucleotides are simultaneously hybridized to genomic DNA. These oligonucleotides also contain target sequences for a set of universal primers, as well as one of a number of particular address sequences recognized by the beads in the array. Following allele specific primer extension and ligation reactions, a set of fluorescently labeled universal primers is added and PCR is carried out, generating amplicons that represent a pool of hundreds of different SNPs. These fluorescent products are then hybridized to the bead array, where the address sequence within each PCR amplicon binds to its cognate sequence on a bead. The fluorescence on each bead is quantified, giving a signal associated with a particular address sequence, which in turn translates to a particular allele. More detailed descriptions of the process and applications of the technology can be found here and at the Illumina web site.
Since this procedure is specifically designed for high throughput applications, economy comes through scale: the more SNPs assayed in more individuals, the lower the cost per SNP. The assay queries SNPs in fixed multiples of 96: 96, 384, 768, or 1536 discrete SNPs are scored in each sample. While final prices vary depending on project details, this figure gives some idea of the cost/genotype as the number of SNPs and number of individuals increases. Any number of individuals can theoretically be assayed, but better economy is realized when at least 288 individuals (3 x 96 well plates) are tested.
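To make the scaling argument concrete, the short Python sketch below works out how many 96 well plates and how many genotypes a project involves. It is only an illustration: the function names are made up for this example, and no real prices are assumed, so the total project cost must be supplied by the user.

```python
import math

# Multiplex levels named in the text: number of SNPs scored per sample.
MULTIPLEX_LEVELS = (96, 384, 768, 1536)
WELLS_PER_PLATE = 96


def project_scale(n_snps, n_samples):
    """Return (plates_needed, total_genotypes) for a project.

    Hypothetical helper for illustration; not part of any Illumina software.
    """
    if n_snps not in MULTIPLEX_LEVELS:
        raise ValueError(f"n_snps must be one of {MULTIPLEX_LEVELS}")
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    plates = math.ceil(n_samples / WELLS_PER_PLATE)
    return plates, n_snps * n_samples


def cost_per_genotype(total_cost, n_snps, n_samples):
    """Divide a user-supplied total project cost by the number of genotypes."""
    _, total_genotypes = project_scale(n_snps, n_samples)
    return total_cost / total_genotypes


plates, genotypes = project_scale(1536, 288)
print(plates, genotypes)  # 3 442368
```

For a fixed total cost, `cost_per_genotype` falls as either the SNP count or the sample count rises, which is the economy of scale described above.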
Beginning the Project:
Projects using the Illumina platform are time and resource intensive undertakings. We recommend you first contact us to go over the scale of your project and the possible costs and time frame involved. The design of the allele specific and locus specific oligonucleotide pool to be used in the project is one of the most time consuming steps, especially when the experiment involves non-model organisms with less well characterized genomes. During the oligo design phase, there are ongoing interactions between the researcher and the bioinformaticists at Illumina to develop the best set of SNPs, i.e., those that are most likely to "convert" (give clean data) in the genotyping assay. Because of this filtering process, it is important to provide initial sequence data on more SNPs than the assay will eventually query. All bioinformatics carried out by the Illumina scientists is included in the cost of the oligo pool ordered from the company. The minimum oligonucleotide order provides enough reagent to assay 480 individuals (5 x 96 well plates). There are other consumable costs that go along with the assay, and different options on how to proceed and possible prices will be covered in our initial discussions.
Depending on the format of the SNPs to be submitted for OPA (oligo pool assay) design, users can take advantage of the Assay Design Tool made available by Illumina. Download this document for more information on the process of preparing a list of SNPs and submitting it to Illumina. Your SNPs will be assigned scores based on their probabilities of performing well in the Golden Gate assay, as described. Local tools are also in development to further assist researchers in submitting sequences that are not in "Illumina-ready" format. We will have more information here as these tools become available.
Running the Samples:
We recommend that the oligo pool be shipped directly to us when it is ordered. This allows us to immediately sequester it in our pre-PCR clean room, so that it is readily available whenever your DNA samples are delivered. Please notify us when you place the OPA order, because this is when we order the other reagents used in your assay. Because of the cost of the supplies and the variety of possible formats, we do not maintain stocks of the Illumina reagents or materials used in the assay.
Assays can be run in two formats: 96 well plate and 16 well chip. Running the assay requires use of all available slots, so the number of samples provided should be a multiple of 16 or 96. We recommend that you provide at least one duplicate sample on each 96 well plate. This helps ensure that the allele clustering software is making correct calls.
Good quality, accurately quantified input genomic DNA is important. NOTE: DNA preps containing a lot of RNA or protein will not perform well in the assay (soluble low molecular weight contaminants such as salt or polysaccharides are better tolerated). Input DNA concentrations should be 50 ng/ul, although concentrations up to 250 ng/ul are tolerated, and concentration normalization among samples is generally not essential. Samples should be resuspended in water, 10 mM Tris 8.0 (Qiagen EB), or TE. The assay will use 5 ul of DNA at this concentration, and we ask that you provide us with enough to do at least two assays (>10 ul would be best). It would be prudent to maintain a stock of the same DNA samples in your lab, as we do not provide sample archiving guarantees. Illumina recommends that all DNA be quantified using the PicoGreen fluorometric assay. Our facility has the reagents and equipment necessary to carry out this assay; please inquire if you're interested in this option. In summary, please provide:
DNA in 96 well plate format
Sample sheet in Excel (or csv) format arrayed as A1,A2,A3,...H10,H11,H12
At least one duplicate sample/plate
If possible, at least 15 ul of sample at 50 ng/ul in water, TE, or 10 mM Tris 8.0 (EB)
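The sample sheet layout requested above (A1, A2, A3, ... H10, H11, H12) is row by row across the plate. The Python sketch below shows one possible way to build such a sheet and to confirm that a plate contains at least one duplicate sample. The column headers "Well" and "Sample" and all function names are assumptions made for this example, not a required format.

```python
import csv
import io
from collections import Counter

ROWS = "ABCDEFGH"      # 8 plate rows
COLS = range(1, 13)    # 12 plate columns


def well_order():
    """Wells in row-by-row order: A1, A2, ..., A12, B1, ..., H12."""
    return [f"{row}{col}" for row in ROWS for col in COLS]


def build_sample_sheet(sample_names):
    """Pair exactly 96 sample names with wells in row-by-row order."""
    wells = well_order()
    if len(sample_names) != len(wells):
        raise ValueError(f"expected {len(wells)} samples, got {len(sample_names)}")
    return list(zip(wells, sample_names))


def duplicated_samples(sample_names):
    """Names that appear more than once on the plate (the replicates)."""
    counts = Counter(sample_names)
    return sorted(name for name, n in counts.items() if n > 1)


def to_csv(sheet):
    """Render the sheet as csv text; the header names are assumed, not prescribed."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Well", "Sample"])
    writer.writerows(sheet)
    return buf.getvalue()
```

A plate passes the duplicate check when `duplicated_samples` returns a non-empty list; the csv text from `to_csv` can be saved to a file or opened in Excel.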
There are data indicating that whole genome amplified DNA preps can work nearly as well as unamplified DNA; for more information on this check out this tech note from Illumina.
Output:
A successfully run Illumina assay provides a very large amount of data for downstream analysis. Our primary responsibility will be to ensure that the assay performs to specifications using standardized technical parameters and that no PCR contamination is detected (contamination detection currently available only in human and murine assays). The presence of replicates serves as an internal control for assay reproducibility, which is why we recommend their inclusion. Illumina assay specifications indicate that one or two DNA samples on a 96 well plate will show a detectable fall off in quality, depending on DNA integrity.
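As an illustration of how replicates can serve as an internal control, the Python sketch below compares the genotype calls of two replicate wells of the same DNA. The genotype coding ("AA", "AB", "BB") and the no-call symbol "--" are assumptions made for this example; the actual coding depends on how the calls are exported.

```python
def replicate_concordance(calls_a, calls_b, no_call="--"):
    """Fraction of SNPs with identical calls in two replicates.

    Only SNPs called in both replicates are compared. Returns None when
    no SNP was called in both. The genotype coding is an assumption.
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("replicates must cover the same SNPs in the same order")
    compared = 0
    matches = 0
    for a, b in zip(calls_a, calls_b):
        if a == no_call or b == no_call:
            continue  # skip SNPs missing in either replicate
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return None
    return matches / compared
```

A concordance well below 1.0 between replicates suggests that either the DNA or the cluster calls for the discordant SNPs deserve a closer look.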
After running the assay, data can be analyzed at a core facility workstation, or users can acquire a copy of the Illumina software for use on their own (PC based) computers. There are certain restrictions on software acquisition; please call or email for details. We will provide basic training on the software and will be available for a certain amount of consultation free of charge as your project gets underway. Ultimately, it is up to the user to decide which DNA samples and which SNP allele calls to use in downstream analyses. We can assist in providing guidelines based on Illumina recommendations and our experience. Genotype calls can readily be exported in various formats (txt, csv, etc.) for manipulation in other programs.
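Once calls are exported, one simple way to start deciding which samples and SNPs to keep is to look at call rates. The Python sketch below assumes a hypothetical csv layout (one row per sample, one column per SNP, with "--" marking a no-call); the real export layout depends on the options chosen in the Illumina software, so the parsing would need to be adapted.

```python
import csv
import io


def call_rates(csv_text, no_call="--"):
    """Return (sample_rates, snp_rates) from exported genotype calls.

    Assumed layout (hypothetical): a header row of 'Sample' followed by
    SNP names, then one row per sample with one call per SNP.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    snps = header[1:]
    if not snps:
        raise ValueError("no SNP columns found")
    snp_called = [0] * len(snps)
    sample_rates = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        sample, calls = row[0], row[1:]
        if len(calls) != len(snps):
            raise ValueError(f"row for {sample} has the wrong number of calls")
        called = [call != no_call for call in calls]
        sample_rates[sample] = sum(called) / len(snps)
        for i, is_called in enumerate(called):
            snp_called[i] += is_called
    n_samples = len(sample_rates)
    snp_rates = {snp: n / n_samples for snp, n in zip(snps, snp_called)} if n_samples else {}
    return sample_rates, snp_rates
```

The cutoff applied to these rates is left to the user; no threshold is built in here.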