The entire PCR workflow is vulnerable to factors that introduce variability. Many of the variable components are unavoidable, such as the source of the sample or the requirement for a reverse transcription step. Assay design is also highly variable; it can make the difference between PCR success and failure and contributes to the reproducibility and sensitivity of an assay. The process of assay design follows a logical flow. The first step is to determine the desired target location. In some cases the sequence of the oligos is determined by the application and cannot be changed, e.g., SNP detection; in others the entire gene may be used, e.g., copy number determination. Once the approximate assay sites are selected, the most suitable primers are identified and modifications determined. When assays are to be run in multiplex it is important to consider the potential for interaction of all oligos in the reaction and also the relative abundance of the targets. In challenging situations, e.g., where the objective is to detect very low copy numbers or small differences in target concentration, it is advisable to select and test several primer combinations and then combine them with a suitable probe.
The process of assay design is greatly facilitated by adoption of suitable design software. OligoArchitect provides two options for design support. The first is OligoArchitect Online, a software design tool with a wide range of options. If the design requires a specialized capability, the second option is to request the design via OligoArchitect Consultative, utilizing the assistance of our expert molecular biologists.
The amplicon is the region of target sequence that is to be analyzed and is encompassed by the forward and reverse PCR primers. The determination of the amplicon size is, in part, dependent on the method to be used for analysis. When visualizing PCR fragments by gel electrophoresis, the PCR fragment needs to be large enough to be stained efficiently using a DNA binding dye and to fit within the range of the chosen size marker. Similarly, when resolving the fragment through a capillary electrophoresis instrument, the PCR product can range from 100 base pairs to more than 2 kb (the upper limit is ultimately set by enzyme performance).
When using qPCR for the final readout, a smaller amplicon is selected to ensure accurate quantification at each cycle. Ideally a qPCR amplicon size ranges from 75 to 200 bases in length, unless design restrictions take the primers beyond this range. In reality, the fragment size may be determined by consideration of several factors, including the biology under consideration.
The assay location may be pre-determined by the objectives of the experiment. Ideally, assays for the determination of the presence or quantity of an mRNA target are located over an exon–exon junction to avoid detection of contaminating gDNA sequences. However, these regions are often highly folded; therefore a pragmatic decision is required as to the preference for an exon-spanning assay with potentially poorer performance or an exonic assay of higher quality. If the mRNA is abundant and transcribed from a single-copy gene, the contribution of signal from gDNA contamination will be considerably less significant than when detecting a low-abundance transcript from a multicopy gene. Detection of SNPs requires location of a probe or the 3’ end of a primer over the mismatch site.
Analysis of splice variants requires a design approach that is specific to the objective. In some cases it is desirable to detect all splice variants simultaneously and this is simply achieved by selecting an exon boundary that is conserved between all variants. However, investigations into differential expression of each splice variant require a more creative approach. In one example experiment, the objective is to examine which of the alternative transcripts, shown in Figure 6.1, are being expressed. Since the exons are relatively small, amplification across all exons results in a product of around 300 bases, with smaller products resulting from amplification of splice variants. The design options for this study would be:
Figure 6.1. A schematic representation of a gene expressed as four potential splice variants. Each splice variant can be distinguished by design of specific primer pairs across each exon junction or by amplification from a generic primer pair in exons 1 and 4 and differentiation of the splice variants using qPCR, SYBR Green I dye melt analysis.
In general, amplicon sequences should be assessed using the following criteria:
Transcript-specific Amplicon Selection
Most, but not all, DNA is eliminated from the sample during RNA purification. To avoid DNA amplification during RT-qPCR, it is advisable to select primers that either flank a large intron that is not present in the mRNA sequence or that span an exon-exon junction (Figure 6.2).
Intron/exon annotations for known genes from many vertebrate, bacterial, protist, fungal, plant and invertebrate metazoan species are available at the EnsemblGenomes website (http://ensemblgenomes.org/). Alternatively, if both genomic and cDNA sequences for the target gene are publicly available, intron positions can be identified by performing a BLAST search with the cDNA sequence against the genomic database for the target organism (Figure 6.3). Intron 1 in Figure 6.3 is long enough (~6.5 kb) that DNA should not be amplified under conventional qPCR conditions or controlled PCR conditions. However, all other introns are relatively short (<1 kb), so DNA is likely to be amplified across them during RT-qPCR (for an example, see Assay Optimization and Validation). Primers should either span exon-exon junctions, flank a long (several kb) intron, or flank multiple small introns.
Figure 6.2. Illustration of (A) intron-spanning and (B) intron-flanking primers for RT-PCR. Introns are in red and exons are in green. Primers P1 and P2 span an intron and primers P3 and P4 flank an intron. Note that primers P1 and P2 will not generate a PCR product from DNA unless the annealing temperature is extremely low. P3 and P4 may generate a longer PCR product from DNA if the intron is short (~1 kb), but not if it is sufficiently long (several kb).
Figure 6.3. BLAST alignment of cDNA sequence with genomic DNA sequence. The complete cDNA sequence for rat p53 from Genbank (accession number NM_030989) was used in a megaBLAST search for identical sequences in the rat genome (blastn). The alignment of the cDNA to the gDNA on chromosome 10 is shown. Using this information, the exons of the cDNA can be aligned to the corresponding gDNA regions and primer design is directed towards exons that are separated by long introns, e.g., exons 1 and 2.
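The intron check described above can be sketched in a few lines of Python. This is a minimal illustration only: the exon coordinates are hypothetical placeholders (not the real rat p53 coordinates), and the 3 kb cut-off for a "long" intron is an assumption, since the text only specifies "several kb".

```python
# Hypothetical exon coordinates (start, end) on the genomic sequence, 1-based
# and inclusive, as they might be read from a cDNA-versus-gDNA alignment.
# These numbers are illustrative only, not the real rat p53 coordinates.
EXONS = [(1, 120), (6621, 6800), (7301, 7450), (7951, 8200)]

# Assumed cut-off for an intron that is "long enough"; the text only says
# "several kb", so adjust this to the PCR conditions in use.
LONG_INTRON_BP = 3000


def intron_lengths(exons):
    """Return the length of each intron lying between consecutive exons."""
    ordered = sorted(exons)
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(ordered, ordered[1:])]


def classify_introns(exons, long_bp=LONG_INTRON_BP):
    """Pair each intron number with its length and a long/short flag."""
    return [
        (number, length, length >= long_bp)
        for number, length in enumerate(intron_lengths(exons), start=1)
    ]


if __name__ == "__main__":
    for number, length, is_long in classify_introns(EXONS):
        verdict = "suitable to flank" if is_long else "gDNA product likely"
        print(f"intron {number}: {length} bp -> {verdict}")
```

With these placeholder coordinates only the first intron passes the assumed cut-off, which mirrors the situation described for Figure 6.3.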
DNA methylation is a crucial part of cellular differentiation, causing gene expression to be altered in a stable manner. Methylation is important for normal development in higher organisms and can be inherited. Gene regulation via DNA methylation involves the addition of a methyl group to position 5 of the cytosine pyrimidine ring or nitrogen 6 of the adenine purine ring. In adult somatic tissues, DNA methylation typically occurs in a CpG dinucleotide context whereas non-CpG methylation is prevalent in embryonic stem cells. Methylation specific assays require identification of CpG islands within the sequence, often within the gene promoter region. This information is automatically located when using Beacon Designer (Premier Biosoft) and is available at http://www.mybioinfo.info/index.php.
While the general primer and probe design suggestions described in this chapter are applicable to numerous applications including gene expression studies, SNP detection, methylation detection studies, copy number determination, monitoring viral load and splice variant quantification, each application also has specific design considerations which will be discussed separately.
For the majority of applications, primers are designed to be fully complementary to the template DNA sequences that they are intended to prime. The basic design considerations for PCR primers include:
In some cases, such as when designing single-nucleotide polymorphism (SNP) assays, there is no flexibility for the location of the assay and the surrounding sequence will also influence the sequence of the selected oligos. The recognition of the association between clinical conditions and both germline and acquired somatic SNPs continues to drive considerable efforts into the development of increasingly sensitive and specific detection systems. This reflects the challenging nature of SNP discrimination using oligo hybridization. The challenge is due to the differences in destabilization between different mismatches. Where G:A, C:T and T:T may have a strong destabilizing effect, G:T and C:A are much weaker because hydrogen bonds can form and therefore it is difficult to discriminate these pairings from the natural G:C and T:A. Many systems are adaptations of the Amplification-Refractory Mutation System (ARMS)1 that has been widely used and was instrumental in screening for cystic fibrosis mutations2. ARMS primers are 30 bases long (longer primers, up to 60 bases are functional). The base at the 3’ is SNP specific and therefore specific for the target sequence (normal or mutant base). An additional mismatch is introduced at the penultimate position. This is determined with consideration to the neighboring bases and the SNP mismatch (Table 6.1 adapted from Little, 2001).
Reprinted with permission of Current Protocols in Human Genetics. Little, S. 2001. Amplification-Refractory Mutation System (ARMS) Analysis of Point Mutations. Curr. Protoc. Hum. Genet. 7:9.8.1–9.8.12.
Additional research groups have used similar ideas and demonstrated the utility of introducing mismatches at the N-2 and N-3 positions in the primers. Liu et al.3 have performed an in-depth analysis of the relative positions of mismatches for the greatest destabilization effect and, therefore, highest specificity.
Amplification of several targets simultaneously in multiplex PCR is required when there is a desire to increase throughput with more PCRs per tube or to save sample material. Primer design is the most critical factor to successful multiplex PCR. It is crucial that the general guidelines are followed and that compatibility is verified for all the primers (and probes) to be included in the reaction. In some cases it can be advantageous to use slightly longer primers with a Tm of around 65 °C. If the resulting amplicons are to be analyzed based on size discrimination, the resolving power of the analysis must be considered in the assay design. When attempting to quantify multiple targets using qPCR, the amplicons should be as similar as possible to avoid amplification bias. In addition to the primers, it is important that the template cannot adopt stable secondary structure as this would impede PCR. If it is known that the targets are present at significantly different concentrations, it may be advantageous to include a blocking primer to the high concentration target to facilitate accurate detection of the lower concentration target4.
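As a rough illustration of what verifying primer compatibility can involve, the sketch below screens a primer set for 3’ ends that are complementary to any other primer in the reaction (including the primer itself). It is a crude string-matching check under stated assumptions: the five-base window is arbitrary, the primer sequences are hypothetical, and dedicated design software also evaluates thermodynamic stability, which this example does not attempt.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_clash(primer_a, primer_b, window=5):
    """True if the last `window` bases of primer_a can base-pair with primer_b.

    The 3' tail of primer_a anneals antiparallel to primer_b wherever
    primer_b contains the reverse complement of that tail.
    """
    tail = primer_a.upper()[-window:]
    return reverse_complement(tail) in primer_b.upper()


def screen_primer_set(primers, window=5):
    """Return (name_a, name_b) pairs where a's 3' end may anneal to b."""
    hits = []
    for name_a, seq_a in primers.items():
        for name_b, seq_b in primers.items():
            if three_prime_clash(seq_a, seq_b, window):
                hits.append((name_a, name_b))
    return hits


# Hypothetical primers, written 5' to 3', for illustration only.
PRIMERS = {
    "fwd": "ACGTTGCAAGGCTTACCATG",
    "rev": "TTGACATGGCTAACGGTTCA",
    "third": "GGAATCTCGATTGCCAAGTC",
}

if __name__ == "__main__":
    for name_a, name_b in screen_primer_set(PRIMERS):
        print(f"3' end of {name_a} is complementary to {name_b}")
```

Any pair reported by such a screen is a candidate for redesign or closer inspection rather than proof that a primer-dimer will form.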
In contrast to the coding genome, it is estimated that ~97% of the human transcriptome is composed of non-coding RNA (ncRNA)5,6,7. One class within this family is the long non-coding RNAs, which have been described as regulatory RNA molecules. These molecules have roles in epigenetics, development, cancer and essential biological processes8,9. Long ncRNAs are traditionally defined as consisting of RNA strands of at least 200 bases10,11,12. This means that, once the amplicon length has been taken into account, no special considerations, other than those already referred to, need to be made when designing for these targets.
In contrast, the family members that comprise the microRNAs (miRNAs) present considerable design challenges. These are short non-coding RNAs (sncRNA) of 21–23 nt that are produced via a complex cellular pathway at several stages of transcript processing. MicroRNAs negatively regulate protein translation by binding to the transcript (reviewed in Kato et al., 200813) and induce the formation of the RNA induced silencing complex (RISC)14. Commercial assays, such as the MystiCq® line are a welcome solution to the design challenges presented by miRNA. There have been several proposed schemes for qPCR for miRNA analysis15 and for those studying organisms for which there are no commercial products, there are several publications describing potential solutions. Many of these rely on addition of bases to the original miRNA by ligation of an adapter (Chapter 22, Casoldi, et al., in PCR Technologies; Current Innovations ed Nolan16) or by addition of a poly-A tail using polyA polymerase (PAP)17. The addition of a tag to each of the miRNA specific primers enables optimization of hybridization Tm and reactions containing DNA primers have been shown to be more efficient than those spiked with Locked Nucleic Acid18. In this report DNA primers specific to miRNA were designed using conventional PCR primer guidelines with additional considerations:
Since target sequences dictate the primer sequences, it may not always be possible to achieve the desired design criteria. Therefore compromises to assay design are overcome by assay-specific optimization. Some PCR targets may require special processing before a successful assay may be designed. A frequently encountered case concerns the detection of pathogens, including viruses. It is well-known that many viruses have high degrees of variability at specific locations in their genomes. A good example is the Hepatitis B virus. In a recent study, to design a successful qPCR assay against the known HBV variants19, it was necessary to conduct an extensive alignment of all of the available HBV genomic sequences. Several hundred sequences were compared using ClustalW in an attempt to find significant stretches of consensus sequence that might be used for a generic assay design. A snippet of the alignment result is shown in Figure 6.4. The asterisks (*) represent the consensus nucleotides found in the analysis of all of the genomes (a large number of other sequences that were a part of this alignment are not shown due to space restrictions).
Figure 6.4. Partial ClustalW analysis of HBV genome data. All known HBV genomic sequences were aligned using ClustalW and conserved nucleotides identified (*)
When designing primers and probes in such situations, it may be necessary to use oligos that contain mixed bases, also known as “wobbles” or degenerate bases. For example, consider the details of the consensus sequence shown for HBV (Figure 6.5).
Figure 6.5. A selected region of the HBV alignment showing regions of consensus
A region of approximately twenty-three bases is required for a primer. In this case, when considering all possibilities of sequence for all HBV genomes the actual sequence of that twenty-three base region is shown in Figure 6.6:
Figure 6.6. The permutations of primer sequence to accommodate all base options for the selected consensus primer region.
The positions of ambiguous bases can be represented using standard single letter codes for mixed bases (Table 6.3). When these are applied to the sequence shown in Figures 6.5 and 6.6, the oligo can be described as in Figure 6.7.
A = adenosine, C = cytidine, G = guanosine, T = thymidine
Figure 6.7. The ambiguous bases of the consensus region oligo are represented by standard single letter codes.
This option is unlikely to result in a successful PCR primer, because there are additional considerations that need to be addressed concerning the high number of degenerate bases. In particular, a synthetic oligo manufactured using this sequence would, in fact, result in a mixture of each of the possible single base sequences. The number of possible, individual primer sequences is calculated by multiplying the individual base numbers at each position. For this sequence, this means 1×2×1×1×2×1×1×2×2×1×3×2×1×2×2×2×2×1×1×2×2×1×1 = 6,144 possible individual oligos. Therefore, the effective concentration of each specific oligo in the reaction is reduced proportionally. Empirical analysis has shown that the number of different sequence permutations in a primer should not be more than 512, therefore this example would not be an optimal degenerate-base primer. A redesign to a different location would offer a potential solution. In the example shown in Figure 6.8, the primer contains 2 bases with potential mismatches. However, these each have a single alternative base, resulting in 4 oligos in the mixed synthesis and the wobbles are located in the 5’ region of the primer. These factors offer a much higher chance of success than the primer presented in Figure 6.7.
Figure 6.8. Sequence showing alignment consensus bases and potential primer location to a consensus region. The consensus primer is shown using wobble codes
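The two steps described above, writing a consensus with mixed-base codes and counting how many individual oligos it represents, can be sketched as follows. The code table is the standard IUPAC set; the aligned fragments and the 23-base degenerate sequence are hypothetical stand-ins (the latter is constructed only to reproduce the per-position counts quoted in the text and is not the HBV sequence from Figure 6.7).

```python
from math import prod

# Standard IUPAC single-letter codes and the bases each one stands for.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of bases observed in an alignment column -> code.
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC_BASES.items()}

# Empirical upper limit on sequence permutations quoted in the text.
MAX_PERMUTATIONS = 512

# Per-position base counts for the 23-base region, copied from the text.
TEXT_COUNTS = [1, 2, 1, 1, 2, 1, 1, 2, 2, 1, 3, 2, 1, 2, 2, 2, 2, 1, 1, 2, 2, 1, 1]

# Hypothetical sequence with the same per-position counts (illustration only).
EXAMPLE_OLIGO = "ARCTYGGRYADRCYRYRTGRYCA"


def degenerate_consensus(aligned):
    """Collapse equal-length, gap-free aligned sequences into IUPAC codes."""
    if not aligned or len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    columns = zip(*(seq.upper() for seq in aligned))
    return "".join(CODE_FOR[frozenset(column)] for column in columns)


def degeneracy(oligo):
    """Number of distinct single-base oligos represented by a degenerate oligo."""
    return prod(len(IUPAC_BASES[base]) for base in oligo.upper())


if __name__ == "__main__":
    fragments = ["ACGTAC", "ACATAC", "ACGTTC"]  # hypothetical aligned fragments
    consensus = degenerate_consensus(fragments)
    print(consensus, degeneracy(consensus))
    total = degeneracy(EXAMPLE_OLIGO)
    print(total, "acceptable" if total <= MAX_PERMUTATIONS else "too degenerate")
```

A primer with two two-fold wobbles, as in Figure 6.8, gives a count of 4 by the same calculation, well inside the limit.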
When using degenerate oligos in PCR, a modified amplification protocol may be necessary. Cycling may be started with 2–5 cycles at a low annealing temperature (35–45 °C). Also, a slow ramp from the annealing temperature to the extension temperature should be incorporated, taking approximately 3–5 minutes to reach the extension temperature. The protocol should then be finished with 25–40 cycles at a more stringent annealing temperature without the ramp modification.
It is preferable, when possible, to avoid nucleotide heterogeneity. If it is not possible to avoid regions of heterogeneity, which is often the case with difficult targets, then the use of specialized oligo modifications, such as inosine and other “universal” bases such as 5-nitroindole, may help reduce the complexity. The addition of modifying groups such as Locked Nucleic Acid (see Quantitative PCR and Digital PCR Detection Methods) may improve performance.
It is possible to incorporate modified nucleotides into PCR primers. A common example is the addition of a Locked Nucleic Acid base. Locked Nucleic Acid is a modified RNA nucleotide. The ribose moiety of Locked Nucleic Acid is modified with an extra bridge connecting the 2’ oxygen and 4’ carbon (Quantitative PCR and Digital PCR Detection Methods). The bridge “locks” the ribose in the 3’-endo conformation. Locked Nucleic Acid can be mixed with DNA or RNA bases in the oligonucleotide wherever desired. The Locked Nucleic Acid modification results in increased thermal stability, allowing shorter oligos to be designed with a Tm equivalent to that of a longer, non-modified oligo. Locked Nucleic Acid-containing sequences are more specific than oligos composed of DNA alone and are ideally suited to SNP detection. Additional applications of Locked Nucleic Acid modifications include designing oligos for analysis of difficult sequences, such as viruses, where a high degree of variability can make it difficult to design a generic assay22.
As for PCR primers, qPCR probe design also depends largely on the sequence context and the desired application. Single probes such as Dual-Labeled Probes or Molecular Beacons are typically 20–30 bases long. Scorpions® Probes have a shorter probe length of 15–25 bases. In a LightCycler or FRET system, there are two probes; the sensor (probe 1) and anchor (probe 2) probes that are situated in close proximity, separated by 1–5 bases.
After a suitable probe region has been selected, complementary stems are added to the 5’ and 3’ ends to create the Molecular Beacon structure20. The example below shows the addition of a stem sequence (in red) to a Dual-Labeled Probe to create a Molecular Beacon (Figure 6.9 adapted from Thelwell 200021).
Figure 6.9. Adaptation of a Dual-Labeled Probe Assay to a Molecular Beacon Format. Nucleic acids research by Oxford University Press. Reproduced with permission of Oxford University Press in the format reuse in a book/e-book via Copyright Clearance Center.
The Scorpions® Probe requires assembly of the probe with a forward primer such that they adopt the structure: 5’ dye–stem–probe–stem–quencher–blocker–primer. The primer and probe must be on opposite strands since the probe binds to the newly created template that is on the same strand as the primer. Figure 6.10 shows the addition of label, quencher, PCR blocker and stem sequences to a Dual‑Labeled Probe to create a Scorpions® Probe (adapted from Thelwell 200021).
Figure 6.10. Adaptation of a Dual-Labeled Probe Assay to a Scorpions® Probe Format. Nucleic acids research by Oxford University Press. Reproduced with permission of Oxford University Press in the format reuse in a book/e-book via Copyright Clearance Center.
When probes are located over a region of sequence with undesired heterogeneity, ambiguous bases are managed as described for PCR primers. Similarly, Locked Nucleic Acid can be added to probes for the same reasons it is added to primers.
The 5’ end of a Dual-Labeled Probe, Molecular Beacon or Scorpions® Probe is labeled with a fluorophore, usually 6-FAM™ for single assays. When multiplexing, fluorophores are typically chosen in the order FAM, HEX™/JOE™, Cyanine 5 (it is critical to determine compatibility of the label with the instrument). The 3’ end of a Dual-Labeled Probe or Molecular Beacon and the 3’ end of the internal stem region of a Scorpions® Probe are modified with a quencher molecule. Historically, the dye TAMRA was used as an acceptor for FAM emissions, resulting in FAM quenching. Developments in dark quencher technology have resulted in widespread adoption of Black Hole Quenchers® and the more recent introduction of our Onyx Quencher™ collection (see Quantitative PCR and Digital PCR Detection Methods).
One advantage of using relatively short amplicons, typically less than 150 bases, is that it is possible to synthesize a long oligonucleotide that may be used as a synthetic amplification target. Use of such a target may help in development and optimization of assays where the intended target may be rare or in short supply, for example in the inhibition control assay SPUD23 (see Appendix A) or infectious disease detection studies19.
When ordering custom oligos for use in PCR applications, decisions must be made regarding the desired yield/scale of synthesis, purity and required modifications. Each of these factors impacts on the other, e.g., a higher level of purification will result in better quality oligonucleotide but at the cost of a reduction in overall yield. Tables 6.4, 6.5 and 6.6 provide guidance as to the synthesis scale and expected yield of oligonucleotides manufactured by us.
When DNA is synthesized, each nucleotide is coupled to the growing chain sequentially, beginning from the 3’ end of the sequence. In each coupling cycle, a small percentage of the oligo chains will not be extended, resulting in a mixture of full-length product and truncated sequences. After the oligo is cleaved from the support and the protecting groups are removed, purification is used to separate the full-length product from the truncated sequences. In general, the purity required for a specific application depends on the potential effect of the presence of truncated oligomers. For some applications, it is crucial that only the full-length (n) oligo be present. For others, such as PCR primers, the presence of shorter oligos (n-1, n-2, ...) may not affect the experimental results.
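The way truncated sequences accumulate with length can be estimated with a simple model: if every coupling step succeeds with the same probability, the full-length fraction of an n-mer is that probability raised to the power n − 1 (the first base is already attached to the support). The 99% coupling efficiency used below is an assumed, illustrative value, not a specification from this guide.

```python
def full_length_fraction(length, coupling_efficiency=0.99):
    """Estimated fraction of full-length product in a crude synthesis.

    Assumes a constant per-step coupling efficiency and length - 1
    coupling steps; the 0.99 default is an illustrative assumption.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 < coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be in (0, 1]")
    return coupling_efficiency ** (length - 1)


if __name__ == "__main__":
    for n in (20, 35, 50, 80):
        fraction = full_length_fraction(n)
        print(f"{n}-mer: about {fraction:.0%} full-length before purification")
```

Under this model the full-length fraction falls steadily as the oligo gets longer, which is why longer oligos generally call for more stringent purification.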
The desalting procedure removes residual by-products that are remaining from the synthesis, cleavage and deprotection steps.
For many applications, including PCR, desalt purification is acceptable for oligos that are no more than 35 bases long, as the overwhelming abundance of full-length oligo outweighs any contributions from shorter products. Oligos required for cloning or greater than 35 bases in length require an additional method of purification, such as Reverse-Phase Cartridge Purification (RP1), HPLC, or PAGE (depending on length).
Separation on a reverse-phase cartridge removes a high proportion of truncated sequences. The difference in hydrophobicity between full-length product and truncated sequences is used as the basis of the separation. While the full-length oligo is retained on the column, the truncated sequences are washed off. The desired full-length product is then eluted and removed from the cartridge.
As the oligo length increases, the proportion of truncated sequences tends to increase. Not all of these impurities will be removed by RP1 and thus for longer oligos, such as artificial amplicon template oligos or labeled probe oligos, HPLC or PAGE purification is recommended. Reverse-phase, high performance liquid chromatography (RP-HPLC) operates on the same principle as a reverse-phase cartridge. However, the higher resolution allows for higher purity levels. HPLC is an efficient purification method for oligos with fluorophores, such as qPCR probes, as their intrinsic lipophilicity provides excellent separation of product from contaminants. Furthermore, RP-HPLC is a method of choice for larger scales due to the capacity and resolving properties of the column. The resolution based on lipophilicity will decrease as the length of the oligo increases. Therefore, RP-HPLC is usually not recommended for purifying products longer than 50 bases. Although longer oligos (up to 80 bases) can be purified using this method, the purity and yields may be adversely affected.
Anion-exchange separation is based on the number of phosphate groups in the molecule. The anion-exchange purification method involves the use of a salt-gradient elution on a quaternary ammonium stationary phase column or a similar structure. The resolution is excellent for the purification of smaller quantities. This technique can be coupled with purification by RP-HPLC, adding a second dimension to the separation process. Anion-exchange HPLC is limited by oligo length (usually up to 40mers). The longer the oligonucleotides, the lower the resolution on the anion-exchange HPLC column and therefore the lower the purity of the target oligo.
The basis of the PAGE separation is charge over molecular weight, leading to good size resolution, resulting in purity levels of 95–99% full-length product. Yields from PAGE are lower than from other methods due to the complex procedures required for extracting oligos from the gel and the removal of the vast majority of truncated products. This technique is recommended when a highly purified product is required. PAGE is the recommended purification for longer oligos (≥50 bases).
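The length-based guidance in the preceding paragraphs can be condensed into a small decision helper. This is one possible reading of the text, simplified to a single suggestion per oligo; the function name, its arguments and the order in which the rules are applied are assumptions made for illustration, and anion-exchange HPLC is left out for brevity.

```python
def suggest_purification(length, labeled_probe=False, cloning=False):
    """Suggest a purification method from oligo length and intended use.

    Thresholds follow the text: desalt up to 35 bases, PAGE from 50 bases,
    RP-HPLC for fluorophore-labeled probes. Rule ordering is an assumption.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if length >= 50:
        return "PAGE"
    if labeled_probe:
        return "RP-HPLC"
    if cloning or length > 35:
        return "RP1 cartridge or RP-HPLC"
    return "desalt"


if __name__ == "__main__":
    examples = [
        ("20-mer PCR primer", dict(length=20)),
        ("40-mer primer", dict(length=40)),
        ("25-mer labeled probe", dict(length=25, labeled_probe=True)),
        ("120-mer synthetic target", dict(length=120)),
    ]
    for label, kwargs in examples:
        print(f"{label}: {suggest_purification(**kwargs)}")
```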
*Guarantee is for 20mers or longer. Shorter oligos may have fewer ODs.
Note: Post-synthesis modifications may yield 50% less than the above stated values.
All probes are purified by RP-HPLC.
DNA oligonucleotides provided dry are ready for use upon re-suspension. It is recommended that oligonucleotides are resuspended in a weak buffer such as TE (10 mM Tris, pH 7.5–8.0, 1 mM EDTA). In applications where TE is not suitable, sterile nuclease-free water may be used. However, high-grade water may be slightly acidic and is not recommended for long-term storage of oligonucleotides.
A 100 μM stock solution may be obtained by using the following guideline: Take the number of nanomoles (nmol) provided (information found on the tube label and/or quality assurance document supplied with the oligo) and multiply by 10. The result provides the number of microliters of liquid to add to the tube to achieve a final concentration of 100 μM. For example, if the oligo yield is 43.5 nmol, the volume to add for 100 μM stock is 435 μL. Note that this is equivalent to a stock solution of 100 pmol/μL. The stock solution may then be further diluted as necessary, based upon the application requirements. For PCR, 10 μM or 20 μM working concentration is typically used. Store the stock solution in aliquots at –20 °C and avoid multiple freeze–thaw cycles.
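The resuspension guideline above is simple unit arithmetic, sketched here for convenience. The function names are illustrative; the only facts used are that 1 nmol in 1 μL corresponds to 1,000 μM and that dilutions follow C1 × V1 = C2 × V2.

```python
def resuspension_volume_ul(nmol, stock_um=100.0):
    """Volume of buffer (in microliters) to add to `nmol` of dry oligo
    to reach a stock concentration of `stock_um` micromolar."""
    if nmol <= 0 or stock_um <= 0:
        raise ValueError("nmol and stock_um must be positive")
    return nmol * 1000.0 / stock_um


def dilution_volumes_ul(stock_um, working_um, final_ul):
    """Return (stock volume, diluent volume) in microliters for a dilution
    from `stock_um` to `working_um` in a final volume of `final_ul`."""
    if working_um > stock_um:
        raise ValueError("working concentration cannot exceed the stock")
    stock_ul = final_ul * working_um / stock_um
    return stock_ul, final_ul - stock_ul


if __name__ == "__main__":
    # Worked example from the text: 43.5 nmol resuspended to 100 micromolar.
    print(resuspension_volume_ul(43.5))
    # Preparing 100 microliters of a 10 micromolar working solution.
    print(dilution_volumes_ul(100.0, 10.0, 100.0))
```

For the 100 μM default the calculation reduces to the "multiply by 10" rule given in the text.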