5. Protein Techniques

Indirect Protein Sequencing Via Genomic Analyses

1

concept

Indirect Protein Sequencing Via Genomic Analyses

Video duration:

9m

Video transcript

Hey, guys. In this video, we're going to talk about indirect protein sequencing via genomic analyses. Up to this point in the course, we've focused only on direct protein sequencing methods, such as tandem mass spectrometry and Edman degradation. Direct protein sequencing is performed on proteins that have already been extracted or isolated, and it can directly identify the sequence of an unknown protein in a sample. However, direct protein sequencing does not account for how biochemists obtain most of their protein sequence data. Most of that data is actually derived indirectly from genomic analyses, that is, by translating the nucleotide sequences of genes into amino acid sequences. So why is most protein sequence data obtained this way? It turns out that it saves a boatload of time. Working with DNA in a lab is easier than working with proteins. Proteins are sensitive to many conditions and are easily denatured if the temperature or pH is off, whereas DNA is more resistant to breaking down. Because DNA is more stable, it's easier and faster to work with. DNA sequencing is also significantly faster, cheaper, more efficient, and more informative than direct protein sequencing: direct protein sequencing gives us only the amino acid sequence, whereas DNA sequencing gives us the nucleotide sequence, from which we can derive the amino acid sequence using the genetic code. Overall, genomic analyses allow us to collect more protein sequence data, faster.
That raises the question: why do we even need direct protein sequencing if genomic analysis gets us more protein sequence data faster? It turns out that we can't just scrap direct protein sequencing, because it has its own set of advantages. First, genomic analyses cannot identify an unknown protein sample on their own, which is something direct protein sequencing does easily. Genomic analyses require a DNA sample, so if all we have is an unknown protein, we can't perform genomic analyses on it. That's a limitation of genomic analyses. Second, unlike genomic analyses, direct protein sequencing via tandem mass spectrometry can reveal chemically modified amino acid residues. Genomic analyses do not reveal those modifications, but direct protein sequencing can. That's another advantage of direct protein sequencing and another reason why we can't just scrap the direct protein sequencing techniques. The rest of this video will refresh our memories on how the genetic code works, since that is what allows us to perform genomic analyses. Recall from our previous videos that the genetic code reveals the connection between the codons of nucleic acids and the amino acids of proteins. In the example below, we'll use the genetic code to reveal the peptide sequence shown over here on the right. On the left, we have the genetic code.
Recall that the genetic code is read from the codons of the mRNA, and each codon has 3 nucleotides. In this genetic code chart, the first base of the codon is on the left, the second base is along the top, and the third base is over here on the right. The first base limits us to one particular row, the second base limits us to one particular column, and the third base limits us to a specific position within a box. Here we're given a DNA coding sequence, and you can see that it has a 5 prime end and a 3 prime end. This DNA coding sequence can be converted into an mRNA sequence through the process of transcription, represented by this arrow. The mRNA sequence is exactly the same as the DNA coding sequence above, except that all of the thymines (T's) are converted into uracils (U's), because RNA contains uracil instead of thymine. So these two thymines here become uracils in our mRNA sequence. Now that we have our mRNA sequence, the genetic code reads it in codons, which are sets of 3 nucleotides. Our first codon is the first three nucleotides, AUG. The first base of the codon is A, which limits us to this row. The second base is U, which limits us to this column. The overlap between the two is this box right here. The third base of the codon is G, which limits us to this particular position within the box: AUG.
An AUG codon corresponds to a methionine residue, which is why methionine is the first residue at the N-terminal end of our peptide. Moving on to our next codon, we have GCU. G limits us to this row, and C limits us to this column, so now we're in this box. Then U limits us to this one particular position, GCU, which is an alanine residue, so we can put an A for alanine in that position. We continue this process with the next codon, GGC. The first G is here in this row, and the second G limits us to this column, so now we're in this box. Then C limits us to GGC, which is glycine, so glycine is our next residue. You guys are probably remembering how this works by now, so we can fill out the rest of the codons. After GGC, we have CGG, then AGC, and last but not least, AAA. CGG corresponds to arginine, AGC corresponds to serine, and AAA corresponds to lysine. So the amino acid sequence of our peptide is revealed through genomic analysis: we obtain the DNA and sequence it, and then, following transcription and translation with the genetic code, we obtain the sequence of our peptide. This is an indirect method of sequencing peptides, and that's exactly how indirect sequencing via genomic analyses works. In our next couple of videos, we'll get some practice using the genetic code and indirect protein sequencing. So I'll see you guys in those practice videos.
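The row, column, and position lookup described in this video can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the chart is assumed to list bases in the common U, C, A, G order, the names `BASES`, `AMINO`, `lookup`, and `translate_coding_dna` are made up for this sketch, and the DNA string is written out from the mRNA codons read aloud in the transcript (AUG, GCU, GGC, CGG, AGC, AAA), since the slide itself is not part of the page.

```python
# Standard genetic code laid out like the chart: rows, columns, and
# in-box positions all follow the order U, C, A, G (assumed layout).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def lookup(codon):
    """Return the one-letter residue for an mRNA codon ('*' marks a stop)."""
    row = BASES.index(codon[0])       # first base selects the row
    column = BASES.index(codon[1])    # second base selects the column
    position = BASES.index(codon[2])  # third base selects the spot in the box
    return AMINO[16 * row + 4 * column + position]


def translate_coding_dna(coding_dna):
    """Translate a coding-strand DNA sequence written 5' to 3'."""
    mrna = coding_dna.upper().replace("T", "U")  # same sequence, T becomes U
    residues = []
    for i in range(0, len(mrna) - 2, 3):         # step through whole codons
        residue = lookup(mrna[i:i + 3])
        if residue == "*":                       # stop codon ends the peptide
            break
        residues.append(residue)
    return "".join(residues)


# Coding sequence reconstructed from the codons read out in the video.
print(lookup("AUG"))                               # M
print(translate_coding_dna("ATGGCTGGCCGGAGCAAA"))  # MAGRSK
```

Reading the result from left to right gives the peptide from its N-terminal end to its C-terminal end: methionine, alanine, glycine, arginine, serine, lysine.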

2

Problem

Use the genetic code above & the coding DNA sequence below to determine the protein sequence.

Video duration:

4m

Problem Transcript

Alright. So this practice problem wants us to use the genetic code above and the coding DNA sequence below to determine the protein sequence. Notice that up above we have our genetic code, and down below we have our coding DNA sequence. Notice also that this genetic code is specific to mRNA codons; we know that because it has uracils in it instead of thymines. Uracils are found in RNA and thymines are found in DNA. Because we are given a coding DNA sequence, we first need to convert it into RNA to use this genetic code. Recall from our previous lessons that a coding DNA sequence has exactly the same sequence as the mRNA, except that in the mRNA all of the thymines are replaced with uracils. If we highlight all of the thymines in our DNA sequence, there's one here, one there, two here, and one here, for a total of 5 thymines, and all of them are replaced with uracils in the mRNA sequence. So the mRNA sequence is otherwise exactly the same: we have an A, then the T becomes a U, so we have AUG; then GCC; and then UGC, GUU, CUC, and AAG, in order from 5 prime to 3 prime. This is our mRNA sequence. Now we can use the genetic code above to read out the codons, which are sets of 3 nucleotides in the mRNA. Our first codon is AUG, our second codon is GCC, and then come UGC, GUU, CUC, and AAG. Next, we read these codons to determine which amino acids they correspond to. Again, our first codon is AUG. The first base of the codon is on the left-hand side, so because it's A, that limits us to this whole row. The second base is at the top, so because it's U, it limits us to this whole column.
The last base is G, so it limits us to this exact position within the box. So AUG corresponds to a methionine residue, and down below we can put methionine, or its one-letter code M, for this codon. Next is GCC: G is this row, C is this column, so now we're in this box, and the final C limits us to this exact position, which corresponds to alanine. So below, we can put A for GCC. Then we have UGC: U, G, and C limit us to this position, which is a cysteine, so down below we can put C. We then have GUU: G, U, and U put us at this position in this box, so that's valine, V. Then CUC: C, U, and C limit us to this position, so that's a leucine, and down below we can put L. Our last codon is AAG: A and A limit us to this box, and G to this position, which is a lysine, whose one-letter code is K. So the protein sequence is methionine, alanine, cysteine, valine, leucine, and lysine (MACVLK). This is the answer to our practice problem, and that concludes this practice. So I'll see you guys in our next video.
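As a quick check of the steps in this problem, here is a minimal Python sketch. The DNA string is an assumption: the page does not show the sequence, so it is reconstructed from the five thymines counted in the transcript and from the next problem, which reuses the same sequence as a template strand. The fourth codon is the uncertain one, and either reading of it encodes valine. The small `CODONS` dictionary is a hypothetical helper holding only the six standard-code entries this problem needs.

```python
# Assumed coding DNA sequence (5' to 3'); reconstructed, not shown on the page.
coding_dna = "ATGGCCTGCGTTCTCAAG"

# Only the codons needed here, taken from the standard genetic code.
CODONS = {"AUG": "M", "GCC": "A", "UGC": "C", "GUU": "V", "CUC": "L", "AAG": "K"}

thymine_count = coding_dna.count("T")      # thymines that will become uracils
mrna = coding_dna.replace("T", "U")        # coding strand matches mRNA, T becomes U
codons = [mrna[i:i + 3] for i in range(0, len(mrna), 3)]  # sets of 3 nucleotides
protein = "".join(CODONS[codon] for codon in codons)      # N-terminus first

print(thymine_count)  # 5
print(codons)         # ['AUG', 'GCC', 'UGC', 'GUU', 'CUC', 'AAG']
print(protein)        # MACVLK
```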

3

Problem

Suppose the sequence below is a template DNA sequence. What is the corresponding protein sequence?

Video duration:

5m

Problem Transcript

At first glance, this practice problem might seem identical to our previous practice problem, especially since the sequence of nucleotides is the same from 5' to 3'. However, there is a key difference. This problem states that the sequence below is a template DNA sequence and asks for the corresponding protein sequence. In our last practice problem, the sequence given was a coding DNA sequence. Being told that it is a template DNA sequence changes our answer entirely. Recall that the template DNA sequence is complementary to the coding DNA sequence and base pairs with it. The base pairing works as follows: adenines (A) pair with thymines (T), and cytosines (C) pair with guanines (G). To derive the coding DNA sequence from the template DNA sequence, we apply these base pairing rules, so let's go ahead and do that below. A pairs with T, T with A, G with C, and C with G, and we continue that way along the whole sequence. The resulting sequence is our coding DNA sequence, and we can label it as such. Next, recall that in a double-stranded DNA molecule the two strands are antiparallel, meaning their 5' to 3' directions are opposite. Thus, if the top strand runs 5' to 3' from left to right, the coding DNA sequence on the bottom must run 5' to 3' from right to left. Its 5' end is on this side, and its 3' end is on that side. To obtain a protein sequence, we need to use the genetic code, which means converting the coding DNA sequence into an mRNA sequence. The mRNA sequence is the same as the coding DNA sequence, except that thymines (T) are replaced with uracils (U). Also remember that the genetic code is read along the mRNA from the 5' end to the 3' end. So we want to rewrite the mRNA sequence so that it runs 5' to 3' from left to right, replacing T's with U's as we go.
We start with a C, then two T's replaced by U's, giving us CUU. Following this pattern, we have GAG, AAC, GCA, GGC, and finally CAT, which becomes CAU once the T is replaced by U. This gives us our mRNA sequence, which we can break into codons: CUU, GAG, AAC, GCA, GGC, and CAU. Looking these codons up in the genetic code, just as we did in the previous practice problem, CUU codes for leucine (L), GAG for glutamic acid (E), AAC for asparagine (N), GCA for alanine (A), GGC for glycine (G), and CAU for histidine (H). Therefore, our protein sequence from the N-terminal end to the C-terminal end of the peptide is leucine, glutamic acid, asparagine, alanine, glycine, and histidine, which is the sequence represented on this side. This concludes the practice problem, and I'll see you guys in our next video.
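The template-strand procedure above (pair each base, reverse to read 5' to 3', then translate) can be sketched as follows. The template string is an assumption reconstructed from the mRNA codons read out in the transcript, because the page does not show the sequence, and the names `CODON_TABLE`, `template_to_mrna`, and `translate_mrna` are hypothetical helpers for this sketch.

```python
from itertools import product

# Standard genetic code, built in the usual U, C, A, G chart order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def template_to_mrna(template_dna):
    """Return the mRNA (5' to 3') for a template strand given 5' to 3'."""
    pairing = str.maketrans("ATGC", "UACG")        # RNA base that pairs with each DNA base
    complement = template_dna.upper().translate(pairing)
    return complement[::-1]                        # strands are antiparallel, so reverse


def translate_mrna(mrna):
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


template = "ATGGCCTGCGTTCTCAAG"  # assumed sequence, reconstructed from the transcript
mrna = template_to_mrna(template)
print(mrna)                  # CUUGAGAACGCAGGCCAU
print(translate_mrna(mrna))  # LENAGH
```

Treating the same string as a coding strand instead would give a completely different peptide, which is why the problem stresses whether a sequence is labeled coding or template.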

4

Problem

Even when the sequence of nucleotides for a gene is available and genomic analyses can be performed, direct chemical techniques on the physical protein are still required to determine:

A. The molecular weight of a simple protein.

B. The N-terminal amino acid residue.

C. The total number of amino acid residues in the protein.

D. The location of disulfide bonds.

Your Biochemistry tutor

Jason Amores Sumpter

Biology, Biochemistry and Microbiology lead instructor

© 1996–2024 Pearson All rights reserved.