So in earlier videos, we talked a lot about the different charts and graphs that we use to visualize data, and we said that a histogram was one of the ways to visualize quantitative data. And in more recent videos, we've talked a lot about frequency distributions. These are tables that organize the frequency across different classes of numbers or labels, usually just numbers. The problem with these tables, though, is they're kind of just boring. There's just a bunch of numbers and columns, and it's hard to see the different patterns and trends that you'll have to identify in the data.
That's exactly why we use histograms. So what I'm going to show you in this video is how to create a histogram out of a dataset. And, really, what we're going to do is take this table and turn it into a graph with a bunch of bars, and we're going to label the numbers over here. Alright? I'm going to show you how to do this.
And there are a couple of important definitions you'll need to know when it comes to the patterns and trends. Let's go ahead and get started here. So remember that a histogram is essentially just a bar graph or a bar chart but for quantitative data. We use vertical bars to graph frequencies, that's little f, across different classes. So in other words, a histogram is just a graphical representation of a frequency distribution.
So we're going to take this table over here and turn it into a graph. How do we do that? Well, basically, a graph is going to have two axes, an x axis and a y axis. Let's take a look first at our dataset. We have this data of students who are studying for their exam, and we have the time in minutes. We've actually already seen this exact dataset before. Before you actually start with the histogram, you should always build a frequency distribution. I'm assuming that you already know how to do this. We've actually seen this exact frequency distribution before. Alright?
So there's nothing new here. So all we're going to have to do is take this data and then turn it into a graph. Alright? In order to do that, I'm going to need the x and y axes. And basically, what happens here is that the classes or bins will go on the horizontal axis.
And what happens here is I'm going to take these classes over here and then put them on the x axis. Alright? Now, if you use the class limits, like 20 to 29 and 30 to 39, you're just going to sort of cram this x axis over here. So a lot of times what you'll see is these things written as class midpoints. Remember, you can always calculate a midpoint by adding the lower and upper limits and dividing by 2, and we've already figured out what those numbers are.
So the class midpoints are going to go here on the x axis. The first one is going to be 24.5, and then you're just going to write all the rest of them. And I've already done that here for you. Alright? So those are the classes or bins that go on the x axis.
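The midpoint calculation can be sketched in a few lines of Python. This is only an illustration: the classes 20 to 29 and 30 to 39 come from the example, while the remaining class limits are an assumption that the same width-10 pattern continues, and the names `class_limits` and `class_midpoint` are made up for this sketch.

```python
# Class limits as (lower, upper) pairs. The first two appear in the example;
# the rest are ASSUMED to continue the same width-10 pattern.
class_limits = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79)]


def class_midpoint(lower, upper):
    """Return the class midpoint: add the two limits and divide by 2."""
    return (lower + upper) / 2


# One midpoint per class, in the same order as the classes.
midpoints = [class_midpoint(lower, upper) for lower, upper in class_limits]
print(midpoints)
```

The first value is 24.5, matching the first label written on the x axis.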
What about the frequencies? Well, those are just going to go on the vertical axis. Right? So the frequencies over here, your f, are going to go on the y axis.
Alright. So this is going to be your frequency. You're just going to start with 1, 2, 3, 4, and 5. Alright. So now that you have your axes labeled, the next thing you have to do is just draw a bunch of bars that correspond to this data over here.
Alright. So in other words, the 24.5 class is going to have a height of 1 because that's its frequency. And remember, these bars are supposed to touch. So the next one, which is 34.5, is going to have a frequency of 2. You can just draw it just like this.
It doesn't have to be perfect. The next one, 44.5, is going to have a frequency of 4. So it's going to go all the way up here like this. The next one's going to be 54.5, with a height of 3, so that's going to look something like this.
And then the next one is going to be a height of 2. So that's going to look something like this. And then finally, the last one's going to be 1.
So it's going to look something like that. So this is essentially what your histogram looks like. You can shade in these bars if you really, really want to. You don't have to, but this is essentially what this histogram is. So, clearly, we can see a pattern emerging in the data.
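If you'd rather let a computer draw the bars, here is a rough sketch that prints the same histogram as text, using only the Python standard library. The frequencies are the bar heights from the walkthrough; the last two midpoints (64.5 and 74.5) are assumed from the width-10 pattern, and `text_histogram` is a made-up helper name, not a library function.

```python
# Midpoints for the x axis and frequencies (bar heights) for the y axis.
# 64.5 and 74.5 are ASSUMED from the width-10 class pattern.
midpoints = [24.5, 34.5, 44.5, 54.5, 64.5, 74.5]
frequencies = [1, 2, 4, 3, 2, 1]


def text_histogram(labels, freqs, width=5):
    """Return a text histogram with one touching bar per class.

    Assumes single-digit frequencies so the y-axis labels line up.
    """
    lines = []
    # Draw from the tallest level down to 1, filling a cell when the bar reaches it.
    for level in range(max(freqs), 0, -1):
        bars = "".join(("#" if f >= level else " ") * width for f in freqs)
        lines.append(f"{level} | {bars}".rstrip())
    # Horizontal axis, then the class midpoints underneath each bar.
    lines.append("  +-" + "-" * (width * len(freqs)))
    lines.append("    " + "".join(str(label).ljust(width) for label in labels).rstrip())
    return "\n".join(lines)


chart = text_histogram(midpoints, frequencies)
print(chart)
```

Each bar is drawn with no gap between it and its neighbor, since histogram bars are supposed to touch.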
And you may have been able to tell this from the numbers alone, but oftentimes with larger data, it's going to be harder to tell very quickly what patterns or trends are going on. Alright? So now let's take a look at our problem here. Is this distribution normal? Is it skewed? Is it uniform? Or is it none of these? I want to talk about the different shapes of distributions that you're going to see very often, and there are basically 4 of them. The first one, which we're going to talk about a lot, is a normal distribution.
You can see here that this data starts really low, peaks towards the middle, and then drops off again. So in other words, it's a bell shape, and it's basically symmetrical. An example of this would be something like test scores, where some people score very poorly, some people score very well, but most people are usually somewhere in the middle. Alright?
The next 2 are skewed. The first one is called skewed right. And this is always confusing to me because when I think of skewed right, I think it's going to peak to the right, but it's actually the opposite. The data peaks to the left and trails off to the right. That's what skewed right means.
All right. An example of this is annual incomes. Most people earn somewhere around 50,000 or 100,000 or something like that. But there are a few folks who earn like a million, and that pushes the data way off to the right side there. So it gets skewed.
The opposite of that is skewed left, which is basically just the reverse. The data peaks to the right and trails off to the left. An example of this is life expectancies. Most people live into their later years, so it sort of peaks to the right there. All right?
The last one is a uniform distribution. This is where there's no clear winner. The classes have equal, or roughly equal, frequencies. And an example of this is a die roll. Right?
So if you roll a bunch of dice, the faces are 1, 2, 3, 4, 5, 6, and they're all equally probable, so the results are going to form a uniform distribution. Alright? So going back to our problem here, which one of these four shapes is our histogram? Well, hopefully, you can see here that it starts from a low number, peaks towards the middle, and then drops off again towards the right. And it doesn't seem to be skewed in any direction because the highest frequency is right around the middle. So clearly, we can see that this distribution is a normal distribution. And that's the answer. All right. So that is it for histograms and how you take frequency distributions to create them.
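The reasoning we just used to pick a shape can be written down as a very rough rule of thumb. To be clear, this is not a standard statistical test, just a sketch of the eyeball check: the function name `describe_shape`, the 20% cutoff for "roughly equal", and the "middle third" rule are all assumptions made up for this illustration.

```python
def describe_shape(freqs):
    """Give a rough shape label based on where the tallest bar sits.

    Assumes a non-empty list of frequencies with a single clear peak.
    """
    tallest = max(freqs)
    # Roughly equal bars: the spread is within 20% of the tallest bar (ASSUMED cutoff).
    if tallest - min(freqs) <= 0.2 * tallest:
        return "uniform"
    peak = freqs.index(tallest)        # position of the tallest bar
    center = (len(freqs) - 1) / 2      # position of the middle of the x axis
    # Peak inside the middle third of the chart: call it bell-shaped.
    if abs(peak - center) <= len(freqs) / 6:
        return "bell-shaped"
    # Peak on the left means the tail trails off to the right, and vice versa.
    return "skewed right" if peak < center else "skewed left"


# Bar heights from the study-time histogram in this example.
print(describe_shape([1, 2, 4, 3, 2, 1]))
```

For the study-time histogram, the tallest bar sits near the middle, so the check lands on the bell-shaped label, which agrees with the answer above.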
Let's take a look at a couple of practice problems. Thanks for watching.