In the last couple of videos, we talked a lot about how we can visualize different data using various charts and graphs. However, often when we interpret data, we'll have to do it more numerically. So we're going to shift focus in this video. I'm going to start talking about how we can calculate certain important values from datasets. The first one we'll talk about is called the mean.
And luckily, that's a word you've probably heard and seen before. In this video, I'm going to show you how to calculate the mean from datasets. We'll discuss some different notations that we need to know, some important conceptual information, and then we'll just do some examples. Let's get started here. The mean, which is probably something you've heard at some point in math class, maybe even in grade school, is basically just the average of a dataset.
Alright? And when you take an average of a set of numbers, all you're going to do is add up all those values, and then you're going to divide by the total number of values. Alright? So let's just go ahead and look at this example here. Let's say I have a sample of numbers. I've got five, ten, twelve, fourteen, and three. I've got five numbers in this dataset. And to calculate a mean or an average (we're going to use the word mean here), I'm just going to take all those numbers and add them together first. So five plus ten plus twelve plus fourteen plus three. Then I'm going to divide it by the total number of values that I have, which in this case is five.
Alright? So what happens here is when you plug this into your calculator, make sure you do that top part in parentheses because of the order of operations, but you should get a number that's 44 divided by five. Then when you calculate that, you're just going to get 8.8. Alright? So in other words, this dataset here, whatever it represents, has a mean of 8.8.
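If you'd rather check that arithmetic with a few lines of code instead of a calculator, here's a minimal Python sketch of the same steps. The variable names (`sample`, `total`, `n`, `mean`) are just illustrative choices and aren't from the video.

```python
# The five sample values from the example
sample = [5, 10, 12, 14, 3]

total = sum(sample)   # add up all the values: 5 + 10 + 12 + 14 + 3 = 44
n = len(sample)       # count how many values there are: 5
mean = total / n      # divide the sum by the count: 44 / 5

print(total, n, mean)  # prints: 44 5 8.8
```

Summing first and dividing afterward plays the same role as putting the top part in parentheses on a calculator.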
Alright? That's how you calculate that. Now a lot of times in math, we're going to take these complicated lists of instructions, and we'll turn them into shorter equations with symbols. So I want to talk to you about what the mean is. The mean, when you see it, you're going to see this sort of symbol here, x with a little bar on top of it, and we just call it x bar.
And basically, the equation here mathematically can be represented as:

$$\bar{x} = \frac{\sum_i x_i}{n}$$

This expression simply means the summation of all x values divided by n, the total number of values.
We've seen that before. Alright. So that's what that equation means. So this x bar here, when we calculated this, was just a mean of 8.8. Alright.
So what does that mean? Well, the mean is what we call a measure of center or measure of central tendency. And basically, it's a fancy way of saying it summarizes a dataset in one central value. We have numbers in the sample that range from three all the way to fourteen. That's where my min and my max are, and I calculated a number of 8.8, which is, more or less, in the center or in the middle. That's what a mean as a measure of center means. Alright? So that's really it for this first example. Let's go ahead and move on to our second one over here.
So what I want you to do is imagine that this sample of data over here is actually part of a larger population. So now we actually have an extra number in the mix here. We've added this seventy-six. But ultimately, whether you're dealing with a population or a sample, the mean is always calculated the same way. You're just going to add up everything and divide by the total number.
So we've got five plus ten plus twelve plus fourteen plus three plus seventy-six. Again, we're going to put that all in parentheses. And now when you divide by the total number, what do you think you're dividing by? Is it five? Well, be careful here because we've added another number into the mix. There's actually six data values here. So one of the things that you might see is you might see little n's with samples and you might see big N's with populations. But ultimately, you calculate the mean the exact same way. So you're just going to divide by six over here. This ends up being 120 over six, which when you calculate the mean is going to be 20.
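To see that the calculation really is the same for six values as it was for five, here's a small Python sketch. The helper `mean` and the list names are hypothetical, made up for this illustration.

```python
def mean(values):
    """Add up all the values, then divide by how many there are."""
    return sum(values) / len(values)

sample = [5, 10, 12, 14, 3]
population = sample + [76]   # same numbers, plus the extra 76

print(len(population))   # prints: 6
print(sum(population))   # prints: 120
print(mean(population))  # prints: 20.0
```

Because the function divides by `len(values)`, it picks up the right count automatically, whether that's little n for a sample or big N for a population.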
Alright? So just some notation here, whenever you see a population, you may see some different symbols attached to this. You may see 'mu' instead of x bar, and then you may see big N instead of little n. So basically, if you see this equation here where mu is equal to Sigma X over big N, don't freak out because all that's happening here is you're just calculating the mean. You don't really need to know when to use one versus the other.
And if you're ever unsure, x bar is probably your safest bet here. Alright? So just wanted to let you know that. Okay. Cool.
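Just to put the two notations side by side, here's how they'd typically be written out. The subscript and the limits on the sums are the usual textbook convention, which is an assumption on my part since the video only says "Sigma X over big N."

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad\qquad \mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

With the numbers from our two examples, that's $\bar{x} = 44/5 = 8.8$ for the sample and $\mu = 120/6 = 20$ for the population. Different symbols, same calculation.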
So let's talk about these means here for a second. When we calculated the sample mean, we got 8.8. Then when we threw in this extra number of seventy-six, we got a mean of 20. So what's going on there? Basically, what happens is while the mean uses all the data values, any extreme values that you have, any outliers like the seventy-six over here, are going to significantly change your mean.
Alright? And we threw in the seventy-six. That's a number that's so big relative to the other numbers that it kind of shifts the mean, and you still end up with a number that's sort of in between three and seventy-six, but that seventy-six has shifted the mean by a lot. And now you have a mean that's twenty instead of 8.8. So that's something you'll have to be aware of.
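Here's a rough Python sketch of that shift, comparing the mean with and without the seventy-six. The names `without_outlier`, `with_outlier`, and `shift` are just illustrative.

```python
values = [5, 10, 12, 14, 3]
outlier = 76

without_outlier = sum(values) / len(values)                    # 44 / 5
with_outlier = sum(values + [outlier]) / len(values + [outlier])  # 120 / 6

# How far the single extreme value moved the mean (rounded to one decimal)
shift = round(with_outlier - without_outlier, 1)

print(without_outlier)  # prints: 8.8
print(with_outlier)     # prints: 20.0
print(shift)            # prints: 11.2
```

Notice that the new mean of 20 is bigger than every one of the original five values. One extreme number was enough to pull the mean outside the range where most of the data sits.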
Alright? So that's it. That's it for the introduction to how to calculate the mean. Let's go ahead and take a look at some practice problems.