Sanger sequencing, also known as the “chain termination method”, is a method for determining the nucleotide sequence of DNA. The method was developed by two-time Nobel laureate Frederick Sanger and his colleagues in 1977, hence the name Sanger sequencing.
To review the general structure of DNA, please see Figure 2.
Sanger sequencing can be performed manually or, more commonly, in an automated fashion via a sequencing machine (Figure 1). Both approaches follow the same three basic steps, as described below.
Figure 1. Three Basic Steps of Automated Sanger Sequencing.
There are three main steps to Sanger sequencing.
In the first step, the DNA sequence of interest is used as a template for a special type of PCR called chain-termination PCR. Chain-termination PCR works just like standard PCR, but with one major difference: the addition of modified nucleotides called dideoxyribonucleotides (ddNTPs) alongside the normal deoxyribonucleotides (dNTPs). In the extension step of standard PCR, DNA polymerase adds dNTPs to a growing DNA strand by catalyzing the formation of a phosphodiester bond between the free 3’-OH group of the last nucleotide and the 5’-phosphate of the next (Figure 2).
In chain-termination PCR, the user mixes a low ratio of chain-terminating ddNTPs in with the normal dNTPs in the PCR reaction. ddNTPs lack the 3'-OH group required for phosphodiester bond formation; therefore, when DNA polymerase incorporates a ddNTP at random, extension ceases. The result of chain-termination PCR is millions to billions of oligonucleotide copies of the DNA sequence of interest, terminated at random lengths (n) by a ddNTP at the 3’ end.
In manual Sanger sequencing, four PCR reactions are set up, each with only a single type of ddNTP (ddATP, ddTTP, ddGTP, and ddCTP) mixed in.
In automated Sanger sequencing, all ddNTPs are mixed in a single reaction, and each of the four ddNTPs has a unique fluorescent label.
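The random termination described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: every incorporated nucleotide is a ddNTP with the same fixed probability (`dd_fraction`, a made-up value), the template is written in the order the polymerase reads it, and the function names are hypothetical.

```python
import random

# Watson-Crick pairing used to build the new strand from the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_once(template, dd_fraction, rng):
    """Simulate one polymerase extension along `template`.

    Returns (fragment, terminated). `fragment` is the newly synthesized
    strand, 5' to 3'; `terminated` is True when a ddNTP stopped extension.
    """
    fragment = []
    for base in template:
        fragment.append(COMPLEMENT[base])
        # With probability dd_fraction the incorporated nucleotide is a
        # ddNTP, which lacks the 3'-OH, so the chain ends here.
        if rng.random() < dd_fraction:
            return "".join(fragment), True
    return "".join(fragment), False  # reached the end of the template


def chain_termination(template, copies=10_000, dd_fraction=0.1, seed=0):
    """Collect the ddNTP-terminated fragments from many extensions."""
    rng = random.Random(seed)
    fragments = []
    for _ in range(copies):
        fragment, terminated = extend_once(template, dd_fraction, rng)
        if terminated:
            fragments.append(fragment)
    return fragments


fragments = chain_termination("TACGGATC")
lengths = sorted({len(f) for f in fragments})
print(lengths)
```

With enough copies, every length from one nucleotide up to the full template is represented, and each fragment ends in the base that the ddNTP contributed at that position.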
In the second step, the chain-terminated oligonucleotides are separated by size via gel electrophoresis. In gel electrophoresis, DNA samples are loaded into one end of a gel matrix, and an electric current is applied; DNA is negatively charged, so the oligonucleotides will be pulled toward the positive electrode on the opposite side of the gel. Because all DNA fragments have the same charge per unit of mass, the speed at which the oligonucleotides move will be determined only by size. The smaller a fragment is, the less friction it will experience as it moves through the gel, and the faster it will move. As a result, the oligonucleotides will be arranged from smallest to largest, reading the gel from bottom to top.
In manual Sanger sequencing, the oligonucleotides from each of the four PCR reactions are run in four separate lanes of a gel. This allows the user to know which oligonucleotides correspond to each ddNTP.
In automated Sanger sequencing, all oligonucleotides are run in a single capillary gel electrophoresis within the sequencing machine.
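As a rough model of the separation step, a gel can be treated as a sort by fragment length. The sketch below assumes fragments are given as strings of the synthesized strand; in the manual case the terminal base stands in for the reaction tube each fragment came from. The function names are illustrative only.

```python
def run_capillary(fragments):
    """Automated case: one lane, fragments ordered smallest to largest."""
    return sorted(fragments, key=len)


def run_slab_gel(fragments):
    """Manual case: one lane per ddNTP reaction.

    Each lane is a sorted list of band positions (fragment lengths).
    Identical fragments collapse into a single band.
    """
    lanes = {base: set() for base in "ACGT"}
    for fragment in fragments:
        # The last base is the ddNTP, which identifies the reaction.
        lanes[fragment[-1]].add(len(fragment))
    return {base: sorted(bands) for base, bands in lanes.items()}


fragments = ["ATGC", "A", "ATG", "AT", "A", "ATG"]
print(run_capillary(fragments))  # ['A', 'A', 'AT', 'ATG', 'ATG', 'ATGC']
print(run_slab_gel(fragments))   # {'A': [1], 'C': [4], 'G': [3], 'T': [2]}
```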
The last step simply involves reading the gel to determine the sequence of the input DNA. Because DNA polymerase only synthesizes DNA in the 5’ to 3’ direction starting at a provided primer, each terminal ddNTP will correspond to a specific nucleotide in the original sequence (e.g., the shortest fragment must terminate at the first nucleotide from the 5’ end, the second-shortest fragment must terminate at the second nucleotide from the 5’ end, etc.). Therefore, by reading the gel bands from smallest to largest, we can determine the 5’ to 3’ sequence of the newly synthesized strand and, from it, the sequence of the original DNA.
In manual Sanger sequencing, the user reads all four lanes of the gel at once, moving bottom to top, using the lane to determine the identity of the terminal ddNTP for each band. For example, if the bottom band is found in the column corresponding to ddGTP, then the smallest PCR fragment terminates with ddGTP, and the first nucleotide from the 5’ end of the original sequence has a guanine (G) base.
In automated Sanger sequencing, a computer reads each band of the capillary gel, in order, using fluorescence to call the identity of each terminal ddNTP. In short, a laser excites the fluorescent tags in each band, and a computer detects the resulting light emitted. Because each of the four ddNTPs is tagged with a different fluorescent label, the light emitted can be directly tied to the identity of the terminal ddNTP. The output is called a chromatogram, which shows the fluorescent peak of each nucleotide along the length of the template DNA.
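The fluorescence read-out can be sketched as picking the brightest dye channel at each peak. The intensity values below are invented for illustration, and real base-calling software also has to handle peak spacing, overlapping dye spectra, and quality scoring, none of which is modeled here.

```python
def call_bases(trace):
    """Call one base per peak by picking the brightest dye channel.

    `trace` is a list of dicts, one per peak in order of increasing
    fragment length, mapping each base to its fluorescence intensity.
    """
    calls = []
    for peak in trace:
        base = max(peak, key=peak.get)  # channel with the highest signal
        calls.append(base)
    return "".join(calls)


# Hypothetical four-channel intensities for four consecutive peaks.
trace = [
    {"A": 950, "C": 40, "G": 35, "T": 60},
    {"A": 30, "C": 55, "G": 20, "T": 880},
    {"A": 45, "C": 25, "G": 910, "T": 50},
    {"A": 20, "C": 990, "G": 60, "T": 35},
]
print(call_bases(trace))  # ATGC
```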
Figure 2. DNA Structure Schematic. DNA is a molecule composed of two strands that coil around each other to form a double helix. Each strand is made up of a string of molecules called deoxyribonucleotides (dNTPs).
Each dNTP contains a phosphate group, a sugar group, and one of four nitrogenous bases [adenine (A), thymine (T), guanine (G), or cytosine (C)]. The dNTPs are strung together in a linear fashion by phosphodiester covalent bonds between the sugar of one dNTP and the phosphate group of the next; this repeated sugar-phosphate pattern makes up the sugar-phosphate backbone.
The nitrogenous bases of the two separate strands are bound together by hydrogen bonds between complementary bases to form the double-stranded DNA helix.
Reading the Sanger sequencing results properly will depend on which of the two complementary DNA strands is of interest and what primer is available. If the two strands of DNA are A and B and strand A is of interest, but the primer is better for strand B, the output fragments will be identical to strand A. On the other hand, if strand A is of interest and the primer is better for strand A, then the output will be identical to strand B. Accordingly, the output must be converted back to strand A.
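So, if the sequence of interest reads “TACG” and the primer is best for that strand, the output will be “ATGC” and, therefore, must be converted back to “TACG”. However, if the primer is better for the complementary strand (“ATGC”), then the output will be “TACG”, which is the correct sequence. Note that in this example the two strands are written aligned base by base, so they read in opposite 5’/3’ directions; when both strands are written 5’ to 3’, as sequence files usually are, the conversion is the reverse complement rather than the plain complement.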
In short, before starting, you need to know what you’re targeting and how you’re going to get there! Keeping this in mind, here is a walk-through of the former example (TACG -> ATGC -> TACG). If the dideoxynucleotide labels are T = yellow, A = pink, C = dark blue, and G = light blue, you will end up with the short sequences primer-A, primer-AT, primer-ATG, and primer-ATGC. Once the fragments have been separated by electrophoresis, the laser will read the fragments in order of length (pink, yellow, light blue, and dark blue) and produce a chromatogram. The computer will convert the letters, so the final sequence is the correct TACG.
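The colour example above can be traced in a few lines of code. The dye-to-base mapping is the one given in the example; the function names are illustrative. The `reverse_complement` helper is an addition for the case where both strands are written 5' to 3', which the example itself does not use.

```python
# Dye colours from the example above (the colour assignment is arbitrary).
DYE_TO_BASE = {"pink": "A", "yellow": "T", "light blue": "G", "dark blue": "C"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def read_colours(colours):
    """Translate detected dye colours, shortest fragment first, into bases."""
    return "".join(DYE_TO_BASE[colour] for colour in colours)


def complement(seq):
    """Base-by-base complement, keeping the strands aligned as in the text."""
    return "".join(COMPLEMENT[base] for base in seq)


def reverse_complement(seq):
    """Complement rewritten 5' to 3', for strands stored in that direction."""
    return complement(seq)[::-1]


detected = ["pink", "yellow", "light blue", "dark blue"]
raw_read = read_colours(detected)
print(raw_read)              # ATGC
print(complement(raw_read))  # TACG
```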
Sanger sequencing and PCR use similar starting materials and can be used in conjunction with each other, but neither can replace the other.
PCR is used to amplify a target DNA sequence in its entirety. While fragments of varying lengths may be produced by accident (e.g., the DNA polymerase might fall off), the goal is to duplicate the entire target sequence. To that end, the “ingredients” are the target DNA, nucleotides, DNA primers, and a heat-stable DNA polymerase (typically Taq polymerase, which can survive the high temperatures required in PCR).
In contrast, the goal of Sanger sequencing is to generate every possible length of DNA up to the full length of the target DNA. That is why, in addition to the PCR starting materials, the dideoxynucleotides are necessary.
Sanger sequencing and PCR can be brought together when generating the starting material for a Sanger sequencing protocol. PCR can be used to create many copies of the DNA that is to be sequenced.
Having more than one template to work from makes the Sanger protocol more efficient. If the target sequence is 1,000 nucleotides long and there is only one copy of the template, it is going to take longer to generate the 1,000 tagged fragments. However, if there are several copies of the template, in theory it will take less time to generate all 1,000 of the tagged fragments.
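One way to get a feel for why more template copies help is to estimate how many of the possible fragment lengths are expected to appear at least once. The sketch below looks at coverage rather than time, and rests on loud assumptions: each template copy is extended exactly once, and every added nucleotide is a ddNTP with the same probability (`dd_fraction`, an arbitrary value). It is a toy model, not a description of a real protocol.

```python
def expected_lengths_covered(target_length, copies, dd_fraction=0.01):
    """Expected number of distinct fragment lengths seen at least once.

    Assumes one extension per template copy and a fixed probability
    `dd_fraction` that any incorporated nucleotide is a ddNTP.
    """
    expected = 0.0
    survive = 1.0  # probability that extension reaches the current position
    for _ in range(target_length):
        p_stop_here = survive * dd_fraction
        # Chance that at least one of the copies terminates at this length.
        expected += 1.0 - (1.0 - p_stop_here) ** copies
        survive *= 1.0 - dd_fraction
    return expected


for copies in (1, 1_000, 1_000_000, 1_000_000_000):
    covered = expected_lengths_covered(1_000, copies)
    print(f"{copies:>13,} copies -> about {covered:.0f} of 1000 lengths")
```

Under these assumptions a single copy yields at most one terminated fragment, while very large copy numbers are expected to cover essentially all 1,000 lengths in one pass.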