You might remember from biology that DNA polymerase actually requires a few things in order to synthesize DNA, including an RNA primer, a template strand, and deoxynucleotide triphosphates. It's going to use that template strand to know the correct order of bases to put together, because the template strand has a base sequence that's complementary to the sequence that DNA polymerase wants to synthesize. And it's going to use the deoxynucleotide triphosphates as the building blocks for the polymer, and remember, it's going to be using the energy from breaking those phosphate bonds in order to actually drive this reaction. Now, it has to use an RNA primer because DNA polymerase can't start synthesizing a new strand of DNA on its own. It can only elongate a strand that already exists. So in our cells, a small stretch of RNA is laid down on the template strand, and then DNA polymerase comes and attaches and lengthens that. Of course, what it adds is DNA, and later the RNA is removed, but you don't really need to worry about all that. You just need to understand the basics of how DNA polymerase works. You can see the reaction carried out by DNA polymerase summarized in this equation right here:
primer of length n + deoxynucleotide triphosphate → primer of length (n + 1) + pyrophosphate + H⁺

And as byproducts from that reaction, we get pyrophosphate and a hydrogen ion. These byproducts are actually going to become important later when we talk about some other DNA sequencing methods. But for now, we're going to talk about the original DNA sequencing method, and that is dideoxy DNA sequencing. Dideoxy DNA sequencing is a method that relies on dideoxynucleotide triphosphates. Normally, a deoxynucleotide triphosphate is missing an OH group on the 2' carbon. Dideoxynucleotide triphosphates are missing not only the 2' hydroxyl but also the 3' hydroxyl group. Now, this is very important because during the course of DNA synthesis, that 3' hydroxyl is the attachment point for the next nucleotide. In the course of the reaction, we add our new nucleotide onto that 3' hydroxyl, creating a bond there, and then our byproducts are a hydrogen ion and pyrophosphate. If that 3' hydroxyl is missing, we don't have anything to add a new nucleotide onto. So, when a dideoxynucleotide is added to a growing strand, it's going to halt synthesis, causing the strand of DNA to stop at that particular nucleotide.
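To make the chain-termination idea concrete, here is a minimal sketch in Python. It is a toy model and not real chemistry: the `Nucleotide` class, the `has_3_oh` flag, and the `add_nucleotide` function are hypothetical names made up for illustration, and the model ignores the template strand and the difference between RNA and DNA in the primer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nucleotide:
    base: str              # "A", "C", "G", or "T"
    has_3_oh: bool = True  # False models a dideoxynucleotide (no 3' hydroxyl)


def add_nucleotide(strand, incoming):
    """Model one polymerase step: strand of length n -> strand of length n + 1.

    Returns the longer strand plus the byproducts named in the equation.
    Raises ValueError if there is nothing to attach the new nucleotide to.
    """
    if not strand:
        raise ValueError("polymerase can only elongate an existing primer")
    if not strand[-1].has_3_oh:
        raise ValueError("chain terminated: no 3' hydroxyl to attach to")
    return strand + [incoming], ("pyrophosphate", "H+")


# Hypothetical two-nucleotide primer, just to have something to elongate.
primer = [Nucleotide("A"), Nucleotide("C")]

# A normal deoxynucleotide extends the strand.
strand, byproducts = add_nucleotide(primer, Nucleotide("G"))

# A dideoxynucleotide can still be added onto the existing 3' hydroxyl...
strand, byproducts = add_nucleotide(strand, Nucleotide("C", has_3_oh=False))

# ...but nothing can be added after it.
try:
    add_nucleotide(strand, Nucleotide("A"))
    terminated = False
except ValueError:
    terminated = True
```

The point of the sketch is that the dideoxynucleotide itself goes on fine, because the strand it joins still has a 3' hydroxyl. It is the next addition that fails.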
How is this actually used for sequencing? Well, in a clever way. It is a seemingly low-tech method of sequencing DNA, but there's very sophisticated thinking behind it. You run four DNA synthesis reactions. In each, you use the deoxynucleotide triphosphates at a millimolar concentration and a much smaller, micromolar concentration of a dideoxynucleotide triphosphate. So you have mostly the normal ones that DNA polymerase wants to use for synthesis and just a small amount of the dideoxynucleotide to halt synthesis whenever it's added. This results in a small chance of adding a dideoxynucleotide at any given position, meaning that in the big vat where this DNA synthesis is carried out, only a small portion of the strands get truncated at each position. You do this reaction four times because in each, you only want to add one type of dideoxynucleotide triphosphate: the thymine one in reaction 1, cytosine in reaction 2, adenine in reaction 3, and guanine in reaction 4. Keeping the reactions separate is how you identify which is which, because every truncated strand from a given reaction has to end in that reaction's base. Basically, what you end up with are strands of different lengths from each DNA synthesis reaction. You separate them by length using something like gel electrophoresis, and from the order of the lengths you can actually determine the sequence. So you see the primer, how long the primer is, and then you see a strand that came from the reaction with dideoxy G, right? Like number 4 here. That strand is one nucleotide longer than the primer, so you know that the first base is going to be a G. And then you have one that's two nucleotides longer than the primer, and the reaction it came from tells you the second base, and so on. In this way, you're actually able to determine the sequence of the DNA.
It's a roundabout way of determining the sequence if you think about it, which is why I say it's not the most high-tech method, but there's sophisticated thinking behind this.
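The read-out logic can be sketched in a few lines of Python. This is an idealized sketch, assuming every possible truncated strand shows up on the gel and that lengths are resolved perfectly. The template sequence, the primer length of 5, and the function names are all made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def new_strand_for(template):
    """Bases the polymerase adds, in order, for a template read 3' to 5'."""
    return "".join(COMPLEMENT[base] for base in template)


def fragment_lengths(new_strand, dd_base, primer_len):
    """Lengths of every truncated strand the reaction with dd_base can make.

    A strand can only stop where the new strand has dd_base, because that
    reaction contains only that one dideoxynucleotide.
    """
    return [primer_len + i + 1
            for i, base in enumerate(new_strand) if base == dd_base]


def read_gel(lanes):
    """Walk the fragments from shortest to longest; the lane names the base."""
    by_length = {length: base
                 for base, lengths in lanes.items()
                 for length in lengths}
    return "".join(by_length[n] for n in sorted(by_length))


template = "CATTGAC"   # hypothetical template, written 3' to 5'
primer_len = 5         # hypothetical primer length
new_strand = new_strand_for(template)

# One lane per reaction, in the same order as reactions 1 to 4 above.
lanes = {base: fragment_lengths(new_strand, base, primer_len)
         for base in "TCAG"}

print(lanes)            # {'T': [7, 11], 'C': [10], 'A': [8, 9], 'G': [6, 12]}
print(read_gel(lanes))  # GTAACTG
```

Notice that the shortest fragment, length 6, is in the dideoxy G lane and is one nucleotide longer than the primer, which is the same situation as the example in the lecture.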
Alright. So let's actually turn the page and talk about some specific ways in which this is used.