So as we talked about in our overview of hypothesis testing, every problem begins with this first step, which is writing our hypotheses. Now this is a really important step that determines how the rest of the problem is going to go. And unfortunately, there's a lot about this that isn't always intuitive, such as how and why we even write these two statements based on the problem text. But don't worry because I'm going to break all of this down for you. I'm going to show you exactly, using these problems, how to do this step.
So let's just jump right in and we'll get started here. Okay. So every hypothesis test begins with writing these two statements that are going to be mathematical with symbols, and basically, we're going to have to extract these from the problem text. So let's take a look at these two statements in more detail. The first one is always a claim that's being made about a population.
So for example, if I claimed that 30% of everybody in the United States likes vanilla ice cream, that's a claim that's being made about a population. We call that the null hypothesis. The symbol that you're always going to see for this is \( H_0 \), read as "H naught" or "H zero." And the way that this is usually going to be written is as a statement about a parameter, and it's going to be a parameter of a population.
So in other words, it's going to be \( \mu \) like a mean, \( p \) like a proportion, or \( \sigma \) like a standard deviation. And the symbol that we're going to use is always going to be an equal sign, and then it's just going to be some value from the problem. So for example, if I claim that 30% of people like vanilla ice cream, that's a proportion, so that's \( p = 0.30 \). So it's always a letter like \( \mu \), \( p \), or \( \sigma \) equals some number from the problem. Some words for this you may see are kind of like the default assumption or the status quo.
It's basically just giving you a number that you're going to assume to be true for the rest of the problem so that you can challenge it with the second statement. We'll talk about that in a second. Alright? So before we actually even get to the alternative hypothesis, let's just jump right into our problem and see if we can figure out what the null hypothesis is. Alright?
So we're going to write our two statements using these problems. Let's get started. So for the first one, you're a researcher investigating the average age of students at your university. The enrollment office claims that the mean age is 23. You're looking to test if current students are younger than this claimed average.
So again, we're going to write the null hypothesis. It's going to be a parameter equals some value. The first thing is what is the parameter that we're looking at? Again, there are three options here. Is it a \( \mu \), a \( p \), or a \( \sigma \)?
In this case, we're looking at the mean age, and mean tells us we're looking at a \( \mu \). So that's our parameter. And what's the value? Well, the enrollment office claims that the population mean age is 23, so that's the value. So in other words, our null hypothesis, the thing that we're assuming to be true from now on, is that \( \mu = 23 \).
That's all there is to it. So that's the null hypothesis. Again, that's the status quo, the thing we're eventually going to try to challenge or find evidence against. Alright? So that leads us to the second statement, which is the alternative hypothesis.
This is essentially just an opposing claim that you're trying to find evidence for. Alright? So the way this is usually written is \( H_a \) instead of \( H_0 \), so you can distinguish the two. And basically what's going to happen here is whatever parameter you use in the null, you also use in the alternative, and whatever value you use in the null, you also use in the alternative. The only thing that's different is the symbol that goes in the middle, and it's going to be one of three possible things.
It's either going to be a less than, a greater than, or it's just going to be a not equals. So for example, in my vanilla ice cream problem, the \( p \) is still going to be the same. The 0.30 is going to be the same, but one alternative hypothesis is that the actual proportion of people who like vanilla ice cream is not 0.30 or 30%. So I would just use a not equal to. That's one of the three possibilities for those symbols.
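Written out side by side, the vanilla ice cream example looks like this. This is just the two statements from above in display form, nothing new:

```latex
\[
H_0 : p = 0.30
\qquad
H_a : p \neq 0.30
\]
```

Same parameter, same value, and only the symbol in the middle changes.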
Alright? So, basically, what's going to happen here is that in order to figure out which one of these symbols it is, you're going to have to interpret and look at the problem text. Alright? So, again, the enrollment office claims, just to go back to our problem, that the mean age is equal to 23. That's our null hypothesis.
You're looking to test, in other words, you're looking to find evidence that current students are actually younger than this claimed average. So these words like younger or older, taller, shorter, greater than, less than, more, fewer are always keywords that are going to help you figure out which one of these symbols it is. In this case, we're looking to find that the mean age is actually less than 23. So basically, what's the parameter? We're still going to use \( \mu \).
We're still going to use 23. All that changes here is that now we're trying to find evidence for the fact that \( \mu \) is actually less than 23, so we write \( \mu < 23 \). So \( \mu = 23 \) is our null hypothesis, and \( \mu < 23 \) is our alternative hypothesis. Alright?
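If it helps to see the keyword idea spelled out, here is a rough Python sketch of it. The word lists come from the keywords mentioned above, but the function name `alternative_symbol` and the fallback to "not equal" when no keyword is found are assumptions made for illustration. Real problem text can be worded in ways a simple lookup like this will miss.

```python
import re

# Keyword lists based on the words mentioned in the lesson (illustrative only).
LESS_WORDS = {"younger", "shorter", "less", "fewer"}
GREATER_WORDS = {"older", "taller", "greater", "more"}


def alternative_symbol(problem_text: str) -> str:
    """Guess the symbol for the alternative hypothesis from keywords.

    Hypothetical helper: returns "<", ">", or "≠". Falling back to "≠"
    when no directional keyword appears is an assumption of this sketch.
    """
    # Split the text into lowercase words so "younger" matches exactly.
    words = set(re.findall(r"[a-z]+", problem_text.lower()))
    if words & LESS_WORDS:
        return "<"
    if words & GREATER_WORDS:
        return ">"
    return "≠"


text = "You're looking to test if current students are younger than this claimed average."
symbol = alternative_symbol(text)
print(f"Ha: mu {symbol} 23")
```

For the student age problem, the word "younger" is what points to the less than symbol. You still have to read the problem yourself to pick out the parameter and the value.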
So basically, what happens is your null hypothesis is always going to have that equal sign in it, and your alternative hypothesis is always going to have something else. Alright? So that's really all there is to it. Now, one thing you might be wondering here is why do you actually have to write these two statements? And really what we're going to see is that this statement over here, the null hypothesis, gives us a value to test.
It gives us a value to assume is true, and this alternative hypothesis tells us how we're going to test against it, and we'll see that in just a second. Alright? So now problems won't always be this straightforward, so let's take a look at our second example. So we have a business journal that wants to estimate the percentage of companies with female CEOs within the United States. They want to prove that greater than 20% of companies nationwide have a female CEO.
Alright? So these problems don't always explicitly tell you, hey, this thing is making this claim about the population, and then you're looking to find evidence against it. Sometimes it'll be a little bit more nuanced, and you'll have to look into the problem a bit more. So in this case, we have a business journal.
We're looking at a percentage. So in other words, what's the parameter? We're looking at not a mean or a standard deviation, but we're looking at a proportion. So the parameter is going to be a \( p \).
Usually when we're dealing with percentages, that's a dead giveaway that it's a proportion. Right? What's the value? In this case, it's actually pretty obvious: the value here is 20%, which you would write as the decimal 0.20. So what is the null hypothesis?
What's the default assumption that you're assuming is true? Well, basically, it's just that the proportion is equal to 0.20, so \( p = 0.20 \), and the claim that they're trying to find evidence for or prove is that the proportion is actually greater than 0.20, so \( p > 0.20 \). So, again, that phrase over here, greater than 20%, is what tells you the alternative hypothesis. Alright? So now that we have a good understanding of how to write the null and alternative hypotheses, let's go ahead and take a look at some practice.
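As a quick recap of the pattern, here is a small Python sketch that writes both statements from the three pieces we pulled out of each problem: the parameter, the value, and the symbol. The function name `write_hypotheses` is made up for illustration, and the value is passed as text so it prints exactly as it appears in the problem.

```python
def write_hypotheses(parameter: str, value: str, symbol: str) -> tuple[str, str]:
    """Return the null and alternative hypotheses as strings.

    Hypothetical helper for illustration: the null always uses "=",
    and the alternative reuses the same parameter and value with
    one of "<", ">", or "≠".
    """
    if symbol not in {"<", ">", "≠"}:
        raise ValueError("symbol must be '<', '>', or '≠'")
    null = f"H0: {parameter} = {value}"
    alternative = f"Ha: {parameter} {symbol} {value}"
    return null, alternative


# Example 1: mean age of students, claimed to be 23, testing for younger.
age_null, age_alt = write_hypotheses("mu", "23", "<")
print(age_null)
print(age_alt)

# Example 2: proportion of companies with a female CEO, testing for greater than 20%.
ceo_null, ceo_alt = write_hypotheses("p", "0.20", ">")
print(ceo_null)
print(ceo_alt)
```

Notice that the only input that changes the alternative relative to the null is the symbol, which is exactly the point of the two examples.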
Thanks for watching, and let's move on.