Positional cloning is an experimental approach that takes a list of candidate genes, or a specific region of a chromosome, and works out which gene is the one of interest. Most commonly, the question being asked is: which gene is causing a disease? You have some kind of disease phenotype, and you know a section of a chromosome is responsible for it, but which part of that region, and which specific gene, produces the phenotype? Here are the five steps to positional cloning:
The first step is to use traditional mapping techniques, either through pedigrees, if you're working with humans, or through actual crosses, if you're working with organisms you can mate in a laboratory setting. Essentially, the first step is identifying the chromosomal region that's associated with the disease, and you do that through mapping. Mapping links DNA markers to the disease phenotype. We know we have a disease, and we study the recombinants and the recombination frequencies, which lets us determine which chromosomal region, between which DNA markers, contains the responsible gene. This region can be huge; it can cover hundreds of genes. But it's the first step in identifying which part of the entire genome we're looking at.
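As a rough illustration of the mapping step, the sketch below computes a recombination frequency for each marker as recombinant offspring divided by total offspring scored, then picks the marker that recombines least often with the disease. The marker names and counts are made up for illustration, and real linkage analysis in pedigrees uses more involved statistics than this simple ratio.

```python
def recombination_frequency(recombinants, total):
    """Fraction of scored offspring that are recombinant between a marker and the disease."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recombinants / total


# Hypothetical counts: marker -> (recombinant offspring, total offspring scored)
counts = {
    "M1": (40, 100),
    "M2": (12, 100),
    "M3": (3, 100),
    "M4": (15, 100),
}

rf = {marker: recombination_frequency(r, t) for marker, (r, t) in counts.items()}

# The marker with the lowest recombination frequency is the most tightly linked.
closest = min(rf, key=rf.get)

for marker, value in rf.items():
    print(f"{marker}: {value:.2f}")
print("Most tightly linked marker:", closest)
```

The lower the recombination frequency, the closer the marker sits to the disease gene, so in this toy data the gene would lie nearest the marker labeled M3.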
The second step starts once you have this region. It could have 100 genes, it could have 5; it doesn't matter how large it is, because you've narrowed it down and you're focused on one region. The second step uses additional DNA markers within that locus to locate the gene more precisely. So if you have this stretch of sequence right here, and you know the gene is in this region, you take DNA markers and place one here, another here, here, and here, and you start dividing up the region. You then look at each interval between neighboring markers and determine which interval contains the disease gene. Is it between this marker and this marker? Or between this marker and this one? That's the second step: narrowing down with additional markers and additional mapping, using recombination frequencies to pin down the specific interval.
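One way to picture this narrowing is the sketch below. It assumes, purely for illustration, that each affected individual has been scored at every marker as either carrying the disease-linked allele (True) or having lost it through recombination (False), and that the markers all affected individuals share form one contiguous block. The function name, marker names, and data are hypothetical.

```python
def critical_interval(markers, affected_haplotypes):
    """Return the (left, right) flanking markers that bound the shared region.

    markers: marker names in chromosomal order.
    affected_haplotypes: one list of booleans per affected individual,
        True where that individual still carries the disease-linked allele.
    Assumes the shared markers form a single contiguous block.
    """
    shared = [
        i for i in range(len(markers))
        if all(hap[i] for hap in affected_haplotypes)
    ]
    if not shared:
        return None
    first, last = shared[0], shared[-1]
    # The nearest recombinant marker on each side bounds the interval.
    left = markers[first - 1] if first > 0 else None
    right = markers[last + 1] if last + 1 < len(markers) else None
    return left, right


markers = ["M1", "M2", "M3", "M4", "M5"]
affected = [
    [True, True, True, True, False],   # recombined between M4 and M5
    [False, True, True, True, True],   # recombined between M1 and M2
    [False, False, True, True, True],  # recombined between M2 and M3
]

interval = critical_interval(markers, affected)
print(interval)
```

Every recombinant individual trims the interval from one side, which is why adding more markers and more individuals keeps shrinking the region.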
Once you've identified two markers, let's say this one and this one, and you know the disease gene is between them, the third step is a technique called chromosome walking. This technique uses overlapping DNA fragments to work your way to the gene of interest. You generally have some kind of genomic library. As a refresher, a genomic library is just a collection of cloned, overlapping fragments of DNA that together cover the genome. From the library, you find one particular clone that fits here, at one of your markers; let's say this is clone A. Then you look through the rest of the library and identify another clone, clone B, that overlaps with clone A. You keep doing this: next you find clone C, which overlaps with clone B, and you do it as many times as needed, all the way through Z if you have to, until you identify where the gene is.
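The walk from clone to clone can be sketched in a few lines of code. This is a toy model under stated assumptions: clones are short strings, an overlap is detected by comparing the end of one clone with the start of the next (in the lab, overlaps are typically found with probes made from clone ends, not by string comparison), and the walk stops when a clone contains a hypothetical target sequence. The clone names, sequences, and the `min_len` threshold are all invented for illustration.

```python
def overlap(a, b, min_len=4):
    """Length of the longest suffix of a that is also a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def walk(library, start, target, min_len=4):
    """Walk from the start clone, one overlapping clone at a time, until a clone contains target."""
    path = [start]
    unused = set(library) - {start}
    current = start
    while target not in library[current]:
        # Pick the unused clone that overlaps the end of the current clone best.
        best = max(
            unused,
            key=lambda name: overlap(library[current], library[name], min_len),
            default=None,
        )
        if best is None or overlap(library[current], library[best], min_len) == 0:
            raise ValueError("walk stalled: no overlapping clone found")
        path.append(best)
        unused.remove(best)
        current = best
    return path


# Hypothetical library: A, B, and C overlap end to end; D is unrelated.
library = {
    "A": "ATGCGTACGTTA",
    "B": "GTTAGCCTAGGA",
    "C": "AGGATCCGATTACA",
    "D": "TTTTCCCCGGGG",
}

path = walk(library, start="A", target="GATTACA")
print(" -> ".join(path))
```

Each pass through the loop is one "step" of the walk: the end of the current clone is used to fish out the next clone, and the unrelated clone D is never visited.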
The fourth step is to repeat this chromosome walking as many times as necessary, walking step by step toward the gene, one clone at a time. The region keeps getting smaller and smaller, so say we've now narrowed it down to clone C. There could be one gene in there, or a small number of candidate genes. Essentially, we started out with the gene potentially being anywhere in the genome, and now we've narrowed it down to a short region with a handful of candidate genes, or even one particular gene. Let's say there are three genes in clone C. Because three is much easier to work with than potentially 20,000, you take each gene individually and look at the phenotypes it's associated with. Where in the cell or in the body is it expressed? When is it expressed? Could it plausibly be associated with this disease? Through these additional examinations, which are step five, we can determine which gene, 1, 2, or 3, causes the phenotype.
Let's just say we looked at gene 2. It's expressed where the disease shows up, and it's expressed at the right time, so gene 2 is the gene of interest.
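The candidate check in step five amounts to a simple filter, sketched below. The tissue and stage values are invented placeholders, and real candidate evaluation draws on much richer evidence than two matching fields.

```python
# Hypothetical expression data for the three candidate genes in clone C.
candidates = [
    {"name": "gene 1", "tissue": "liver", "stage": "adult"},
    {"name": "gene 2", "tissue": "muscle", "stage": "embryo"},
    {"name": "gene 3", "tissue": "muscle", "stage": "adult"},
]

# Hypothetical description of where and when the disease appears.
disease = {"tissue": "muscle", "stage": "embryo"}


def matching_candidates(candidates, disease):
    """Names of candidates expressed in the right place and at the right time."""
    return [
        gene["name"]
        for gene in candidates
        if gene["tissue"] == disease["tissue"] and gene["stage"] == disease["stage"]
    ]


hits = matching_candidates(candidates, disease)
print(hits)
```

Only a gene that matches on both place and time survives the filter, which mirrors the reasoning used to settle on gene 2.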
So in positional cloning, you start with potentially the entire genome, you do some mapping, and you use DNA markers and chromosome walking to narrow the whole genome down to the specific gene or genes responsible for the disease.
The first thing you start with is mapping. In humans, we can't set up crosses, right? So we have to work with the crosses that have already happened, which means using pedigrees. Here is a pedigree, and we take DNA samples from every single person in this family, and probably from even more people than are shown here. By looking at recombination frequencies between different DNA markers and the disease, we determine that this region here is responsible for this disease, whatever it is; it doesn't matter. From there, we take it further. We use more markers, we do chromosome walking, and eventually we find this gene, the one responsible for the disease. And we did it entirely through positional cloning, step by step: starting out very broad with a pedigree, doing some mapping, looking at recombination frequencies and markers to get to this region, and then repeating the process, with chromosome walking if you need it, until you identify the gene of interest that's causing the disease.
Positional cloning is best thought of as going from pedigrees to the gene. That's how it's done. So with that, let's move on.