Hello, everyone, and welcome to the course. Because it's the very first video, I don't want to get into anything super technical or complicated. Instead, I want to just focus on some basic definitions for what statistics is all about as well as some keywords and definitions you'll be hearing a lot throughout the course. And then we'll do some examples together. So let's get started.
So what is statistics all about? It's basically just a branch of science in which you collect data and then organize, analyze, and interpret it in order to learn more about a group of people or companies or things, something like that, so you can make informed decisions. So you collect data, and then you work with it to draw conclusions. Now I know that the word data sounds kind of scary because you're picturing these spreadsheets with a bunch of numbers, rows, and columns, things like that, but it's not really that complicated. Because when we use the word data, all it means is that we're collecting information.
And, yes, sometimes we go out and count or measure that information. For example, I can go out and measure the heights of 10 college students. That's data. Or I can collect it as responses. I can ask 10 people what their favorite ice cream flavors are, and that's data as well.
Now one of the fundamental challenges of working with data and statistics is that oftentimes I can't go out and collect data from every single person I'm interested in. Instead, I have to focus on a smaller group. We have 2 words to describe those 2 different groups of people. The first one is population. This is the set that contains all of the data, all of the measurements, counts, or responses, from all of the people that you're interested in.
This would be like if I went and measured the heights of every single college student in the entire United States. Now, obviously, that is incredibly impractical, and I can't do that. So instead, I just focus on a smaller group of people called a sample. So this is basically just a part of a population, or a subset. The idea here is that if I grab a sample of 100 college students and focus only on that, it will give me some information and help me learn about the wider population.
Alright? Now there are 2 other words we're going to talk about later on, but first we can jump straight into our example here. So with our example, what we're going to do is for a and b, these two boxes over here, we're going to label each dataset as being from a population or a sample. So let's get started.
So for part a, what we have is this box here, and it says the salary of every employee at a marketing firm. Right? So salary is just a piece of data, and it pertains to a group. Right? Every employee at a marketing firm.
Part b says the salaries, same data, from 12 out of 100 total employees at a marketing firm. So which one do you think is the population and which one do you think is the sample? Well, the keywords that are dead giveaways that you're talking about a population are words like every and each. Right? Part a is the salary of every employee at a marketing firm, and so this is the population.
Now you might think that a population just means that we're going to be looking at the salaries of every single marketing employee from the entire country, but it doesn't always have to be as massive as that. In this case, we're only interested in the salaries of every employee at this specific marketing firm, which is still a population. Right? And part b is a sample because we're given some number out of the total number of people within this population, 12 out of 100, so this is a sample. Right?
So that is the difference between sample and population. You can kind of think about the population as the sort of big red bubble over here that contains everybody in the company. But, again, collecting the salaries of all 100 employees can be a little bit impractical. So the idea is we're going to use the small sample of 12 people to get some information about the wider population.
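If it helps to see that in code, here is a minimal Python sketch of the same idea. The salary values are made up purely for illustration (they are not from the example on screen), and the names `population` and `sample` are just labels chosen here.

```python
import random

# Hypothetical population: one salary (in dollars) for each of the
# 100 employees at a firm. These values are invented for illustration.
rng = random.Random(0)  # fixed seed so repeated runs pick the same values
population = [rng.randrange(30_000, 80_000, 1_000) for _ in range(100)]

# A sample is a subset of the population: here, 12 of the 100 salaries,
# drawn at random without replacement.
sample = rng.sample(population, 12)

print("population size:", len(population))
print("sample size:", len(sample))
```

The population holds every employee's salary, while the sample holds only 12 of them, and every value in the sample also appears in the population.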
Okay? So now let's take a look at the second pair of words that we'll be working with, which is parameter and statistic. Now a parameter is a single number that describes a population. This would be, for example, the average height of all the college students in the population. Or if I said, for example, 30% of all the people in the population prefer vanilla ice cream, that would be a parameter.
A statistic is basically the same idea. It's a number that you observe, not from the population, but from the sample. There's a very, very easy way to remember these and never get confused. P always goes with p, so population always goes with parameter, and s always goes with s. So population parameter, sample statistic.
Right? If I talk about the average height of everybody in the entire country, that's a parameter. If I talk about the average salary of these 12 out of 100 employees, or let's say I ask 10 out of this group of college students and 30 percent prefer vanilla ice cream, that's a statistic. Alright? So now let's go ahead and take a look at our second pair of boxes here.
For c and d, what we're going to do is we're going to label each number as a parameter or a statistic. Okay. So what we have here is, at the bottom, for part c, the average salary of all employees at a marketing firm is $41,000. And then this last one over here for part d, what we have is the average salary of 12 out of 100 employees at a marketing firm is $58,000. So, hopefully, you kinda see what's going on here. The average salary in part c, again, is just a single number that you're observing from the entire population, which is all employees. This is referencing this population over here.
So, therefore, this is the parameter. Alright? And that number here is $41,000. However, when we take only a small subset of people, the 12 out of 100, we find that the average salary is $58,000. Notice how those two numbers are not the same, and oftentimes in statistics, they won't be. And that's one of the challenges.
So part d is a statistic and part c is a parameter. Again, it's the same idea here. The parameter is the number you'd get if you grabbed the salaries of everybody at this marketing firm, $41,000. But if you focus only on this group of 12 people, it would be $58,000. Okay? So that's the idea, folks.
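To make the parameter versus statistic distinction concrete, here is a small Python sketch. It assumes a made-up list of 100 salaries, so it will not reproduce the $41,000 and $58,000 figures from the example, and the variable names `parameter` and `statistic` are chosen here just to mirror the vocabulary.

```python
import random
from statistics import mean

# Hypothetical data: invented salaries for all 100 employees (the population).
rng = random.Random(1)  # fixed seed so repeated runs pick the same values
population = [rng.randrange(30_000, 80_000, 1_000) for _ in range(100)]

# 12 of the 100 employees, chosen at random (the sample).
sample = rng.sample(population, 12)

# Parameter: a number computed from the entire population.
parameter = mean(population)

# Statistic: the same kind of number, computed from the sample only.
statistic = mean(sample)

print(f"parameter (average of all 100 salaries): {parameter:,.0f}")
print(f"statistic (average of the 12 sampled salaries): {statistic:,.0f}")
```

The two averages generally come out different, and drawing a different 12 employees would usually change the statistic while the parameter stays fixed.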
That's the basic difference between parameter and statistic. And now that we know the difference, let's take a look at some practice. Thanks for watching.