3. Describing Data Numerically

Standard Deviation

Standard Deviation - Online Tutor, Practice Problems & Exam Prep

1

concept

Calculating Standard Deviation

Video duration:

8m

Video transcript

Measures like the mean and the median try to tell us where the center of a data set is. But oftentimes, we'll need to know more information, like how those values are distributed. So in this video, we're going to talk about something a little bit different: an extremely important calculation that you need to know, a quantity called the standard deviation. By the end of this video, you'll understand in plain English what the standard deviation is. I'll show you how to calculate it using some equations, and we'll go over some examples and practice problems.

So let's just jump right in. Alright? So the standard deviation is not a measure of the center. It's what's called a measure of variation. It's a number just like the mean and the median.

It's a number that represents essentially how spread out the data values are. The letter that we use for standard deviation is s, and basically s is a number that's greater than or equal to 0. And the higher it is, the more spread out the numbers are. Here's an example. Right?

I've got these 2 sets of data, 13, 14, 15, 16, 17, and then 5, 10, 15, 20, 25. You can pause the video and check for yourself if you don't believe me, but the means of both of these sets of data are 15. So they're both 15, but there's clearly something different about them. In this example over here, the numbers 13 through 17 are a little bit more bunched up. And if you were to calculate the standard deviation, you would find that it's 1.58, which doesn't mean much by itself.

Right? If you were to calculate the standard deviation for the right numbers, 5, 10, 15, 20, 25, you would find that the standard deviation is much, much higher. Okay? So here, what happens is the s is low because the numbers are less spread out. They're more bunched up around 15.

Here, the s is higher, somewhere like 8, and it's because the data is more spread out, more spread out around 15. Alright? So that's the basic idea of the standard deviation. Okay, I didn't tell you how to calculate the s's because that's actually what we're going to do in this example here.
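To see where those two values come from, here is a minimal sketch in Python. It assumes the standard library's `statistics` module, whose `stdev` function computes the sample standard deviation (dividing by n minus 1). The names `bunched` and `spread` are just labels chosen for this illustration.

```python
from statistics import mean, stdev

# The two data sets from the example; both have a mean of 15.
bunched = [13, 14, 15, 16, 17]
spread = [5, 10, 15, 20, 25]

for name, data in [("bunched", bunched), ("spread", spread)]:
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    print(name, "mean =", mean(data), "s =", round(stdev(data), 2))
```

The first set gives an s of about 1.58 and the second about 7.91, which matches the "somewhere like 8" mentioned above.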

Okay? So let's get started with our main example. We're going to calculate and find the mean and standard deviation of the sample of numbers that we find here. Alright? So let's get started with the first part over here because finding the mean is something that we've done a bunch.

So let's look at part a. To calculate the mean, remember this equation over here. If we want to figure out a mean of a sample, that's x bar, just add up everything and divide by the total number of observations. So in other words, we have 5 plus 10 plus 12 plus 14 plus 3 plus 4, and then divide by the number of values, which, in this case, n is equal to 6. Alright?

So if you add all these things up, what you're going to get is 48, divided by 6, which is a mean of 8. Alright? Now, by the way, one of the things I love to do in these problems, when you're given a list of numbers that are arranged horizontally like this, is to put them in a table where the numbers go down a column, so we can add them up much more easily. So we can see here that another way you could have done this is arrange them like this and then add all these things up, and you would have gotten 48. Alright?
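The same column-sum idea can be sketched in a few lines of Python. The variable names (`sample`, `total`, `x_bar`) are illustrative choices, not anything from the video.

```python
# Sample from the main example, treated as one column of values.
sample = [5, 10, 12, 14, 3, 4]

total = sum(sample)   # sigma x: add up every value in the column
n = len(sample)       # number of observations
x_bar = total / n     # sample mean

print(total, n, x_bar)  # prints: 48 6 8.0
```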

So the mean of this sample over here is 8. Alright? That's basically a measure of where the center of the numbers is. That doesn't tell us how spread out the data is, and that's what we're going to calculate in part b. So how do we calculate the standard deviation?

Well, here's where we're going to take a look at the equations. There are basically 2 different forms of this equation that you're going to see: one that looks a lot nastier but actually turns out to be easier to use, and a more compact version that's shorter but actually involves more calculations. Now, ultimately, if your professor has a preference for your course and you have to do it one way or the other, go ahead and stick to that. But if not, I always find that the easiest one to use is this equation over here.

Alright? So that's what I'm going to use. They'll both get you the right answer. Okay. So what is this equation telling me?

It looks really nasty at first, but basically, I'm going to take the square root of one big expression, and I've got 1 over n minus 1. So that goes out in front. But I actually know what n is because n is just equal to 6. So in other words, this just becomes 1 over 6 minus 1. Alright?

Then I've got a parenthesis over here, and I've got this giant sigma, which remember means I'm going to add up a bunch of stuff, in this case x squared. So essentially, that's just going to be a number. I'm going to take a bunch of x squareds, whatever those are, add them all up, and I'm going to get a number out of this. And I'm going to subtract another number: that's sigma x, in parentheses, squared, divided by n. So in other words, there's going to be another number over here.

That number gets squared, and then I'm going to divide by n again, which in this case is 6. Okay? So even though this equation looks kind of scary, once you know what n is, there are really only two numbers that you have to figure out. Right? So that's what we're going to do: just find out what those two numbers are.
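Written out, the two forms being described are most likely the following. This is a reconstruction from the spoken description, since the on-screen equations are not part of the transcript. The first is the shortcut form walked through above; the second is the standard definitional form, assumed here to be the "more compact version."

```latex
s = \sqrt{\frac{1}{n-1}\left(\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right)}
\qquad\text{and}\qquad
s = \sqrt{\frac{\sum \left(x - \bar{x}\right)^2}{n-1}}
```

With n = 6, the factor in front is 1/(6 - 1), and the two numbers left to find are the sum of the squared values and the square of the sum of the values.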

Okay? Let's start with the easiest one, which actually is going to be the sigma x that's in parentheses. Because remember, we've actually already used that. Sigma x is just basically the formula that we used or the symbol that we used in our mean calculation. It's where we added up all of the numbers.

Essentially, it's this 48 over here. Once you take all your data values and you add them all up, that's 48. So in other words, this is basically just sigma x. Okay? So what we're going to do here is we're going to take this 48 and remember we're going to have to square it.

So that's one of the missing numbers that we have in the box. The only thing we have to do is just figure out what this other one is. Alright? Let's take a look at that. This 48 was sigma x, right, in parentheses squared.

That's 48 squared. That's not the same thing as this sigma x squared. Alright? What you're doing in this case is you're actually adding up all of the data values that are already squared. So in these problems, what I like to do is I like
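The difference between the two sums is easy to check in code. The sketch below is in Python and applies the shortcut form to the sample from part a. The transcript cuts off before the final value is stated, so the result here comes from the arithmetic itself, not from the video; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

sample = [5, 10, 12, 14, 3, 4]
n = len(sample)

sum_x = sum(sample)                          # sigma x = 48
sum_x_squared = sum(x ** 2 for x in sample)  # sigma x^2: square first, then add
square_of_sum = sum_x ** 2                   # (sigma x)^2: add first, then square

# Shortcut form: s = sqrt( (1 / (n - 1)) * (sigma x^2 - (sigma x)^2 / n) )
s = sqrt((sum_x_squared - square_of_sum / n) / (n - 1))

print(sum_x_squared, square_of_sum, round(s, 2))
# Cross-check against the library's sample standard deviation.
print(round(stdev(sample), 2))
```

Here the sum of squares is 490 while the square of the sum is 2304, and both routes give a standard deviation of about 4.6.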

2

Problem

Find the standard deviation of the sample below. Round your answer to the nearest tenth.

A

24.7

B

3.9

C

607.8

D

15.5

3

example

Calculating Standard Deviation Example 1

Video duration:

2m

Video transcript

Everyone, let's go ahead and see if we can figure this out. So we have 3 samples of students taking some kind of a quiz, and we're going to create histograms out of all three of those samples. These histograms show the number of correct answers where basically the higher the bar, the greater the number of correct answers. Now, without calculating the standard deviation by hand, which would be incredibly tedious for this problem, we're supposed to just rank the standard deviations of each sample from least to greatest.

Alright? So in other words, the whole problem here is, I'm just going to have three numbers, which are going to be sorted from least to greatest. So in other words, some number is going to be less than some other number, which is going to be less than some other number. And, basically, I just have to figure out where s1, s2, and s3 go here. So let's take a look.

Alright? So I've got these three samples, and they're all kind of different. Right? I've got these numbers, which are sort of like two weird sorts of columns that are kind of off by themselves. And I've got this really sort of bunched-up column like this.

And then I've got this column that is sort of like somewhere in the middle. It kind of looks like a normal distribution. Okay? So what's going on here? Well, remember that s is related to how spread out the numbers are. Standard deviation is a measure of how spread out the data is. So if you look at each one of these things, which one is the least spread out? If you look at these numbers, I've got sort of two batches of numbers: one that is really low over here, and these two that are kind of high in terms of number of questions. So the mean is going to be somewhere in the middle.

So in other words, the mean is somewhere like this. This is going to be x̄, and all the values are actually pretty spread out and far away from that mean. So in other words, this is the most spread out. So this is most spread out over here. Whereas, on the other hand, I have sample number 2.

I've got all of these numbers that are basically bunched up all together. So this actually is the least spread out. And, again, just like sample 1 over here, sample 2 also has a mean of, let's say, 4. Right? That seems to be where all the data sort of clusters.

So this is the least spread out. And this is sort of like in the middle. Alright. So again, if the s is the standard deviation, it's the measure of how spread out the numbers are. The higher the s, the more spread out.

Then, basically, what you can see here is that this one will have a high s, and this one will have a low s. So, basically, ranking them from least to greatest, which one do you think is going to come first? It's the one that's going to be least spread out, whereas this one's in the middle, and this one is going to be the most spread out. So the correct order, without calculating anything, is just going to be that s from the second sample is going to be less than s3, and that is going to be less than the standard deviation from the first sample. This one is going to have the highest s.

Alright? So that's how you would sort of solve these problems. Let me know if you have any questions, and let's move on.
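If you want to check this kind of visual ranking numerically, here is a small Python sketch. The quiz scores below are hypothetical: they are invented to mimic the three histogram shapes described (two separate clumps, a tight bunch, and a bell-like shape) and are not the data from the video.

```python
from statistics import stdev

# Hypothetical quiz scores (number of correct answers), made up to mimic
# the three histogram shapes described; each sample has a mean of 4.
samples = {
    "s1": [1, 1, 1, 2, 2, 6, 6, 7, 7, 7],  # two clumps, far from the mean
    "s2": [4, 4, 4, 4, 4, 4, 3, 5, 4, 4],  # tightly bunched around 4
    "s3": [2, 3, 3, 4, 4, 4, 4, 5, 5, 6],  # bell-like, in between
}

# Rank the samples by sample standard deviation, least to greatest.
ranking = sorted(samples, key=lambda name: stdev(samples[name]))
print(" < ".join(ranking))  # prints: s2 < s3 < s1
```

For these made-up values the standard deviations are roughly 0.47, 1.15, and 2.79, so the order agrees with the reasoning above.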

Your Statistics tutor

Physics and Math Lead Instructor