6.10 Iterate with a for loop - Video Tutorials & Practice Problems
Video duration: 6m
Voiceover: Loops are generally frowned upon in R, because they can significantly slow down your program. That said, there are times when loops are necessary, and most beginning R users, especially those coming from other languages, fall into the trap of thinking loop first, vectorize second. So, while you probably want to avoid loops, we will show you how to write them, because they can be necessary sometimes. Most of the examples here will be trivial situations where you would use a vectorized function anyway, but they're just meant to be simple examples.

For the very first one, let's just print out the first 10 numbers. So, the structure of a for loop is the command for, and in parentheses you put the iterator. In this case, I will say i, that will be our iterated value. It is in a vector, and in this case the vector is one through 10. This vector could be anything, and we'll see a little bit later how it could be a character vector, it could be a numeric vector, it's any vector. Now, this i, you could use its value or you could use it as an indexer. Suffice it to say, i goes through and takes on every value of this vector. We close the parentheses and put in our curly brace, which for a single-line statement isn't necessary, but I use it for consistency. And, we will just print out i. Simple as can be. Let's run this, and we print out one line at a time: one, two, three, four, and so on.

Do note that much the same thing could be done, though the output would look very different, if we just printed one through 10, since print is automatically vectorized. We get all our numbers. Now, notice they're all on the same line, because it's one vector. Things act differently, and that's the situation where you might want to use the loop.

As another example, let us build a vector holding fruit names. So, we'll say fruit and we'll assign apple, banana, and pomegranate.
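The transcript does not show the code typed on screen, so what follows is a reconstruction of the first example from the description above, not the exact code from the video. It sketches the loop over one through 10, the vectorized print it is contrasted with, and the fruit vector that the next example uses.

```r
# Loop: i takes on each value of the vector 1:10 in turn,
# and print() is called once per value, so each number gets its own line.
for (i in 1:10) {
    print(i)
}

# Vectorized: a single call to print() shows the whole vector at once.
print(1:10)

# The vector being iterated over can be anything, for example a
# character vector such as the fruit names used in the next example.
fruit <- c("apple", "banana", "pomegranate")
```

The braces around the single print call are optional in R; they are kept here for consistency, as in the narration.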
(keyboard clicking) And, now let's make a second vector that's going to hold the number of characters in each element of this vector, and we'll just put all NAs in there for now. So we say fruit length gets rep of NA, length of fruit. We introduced a new concept in here, rep. It repeats something. In this situation we're going to find the length of fruit, which, if we run this, we will see is three. So, we're going to create a new vector that has NA three times. Let's run this and see it in action. Fruit length, and again I used tab completion to finish that variable name, and we have three NAs.

To make it even more useful, let's make the names of fruit length be the values in fruit. So, we could say names of fruit length gets fruit. And, if we look at fruit length now, we see it's still all NAs, but it's a named vector, so that's very nice.

What we will do now is loop through fruit, figure out how many characters are in each element, and store that in the corresponding element of fruit length. So, we build our for loop. We say for a. Your variable in the for loop can be anything: it can be a, it can be i, it could be a word, it could be a complicated saying. As long as it is a legal variable name, you can use it. So, a in fruit. What that means is a will first take on the value apple. Then, when it's done with that iteration, it'll take on the value banana, and then it will take on the value pomegranate. So, we close our parentheses and open our curly brace. And, we're going to say fruit length, with a in the square brackets. Remember, fruit length has names. So, when a takes on the value apple, this is finding the apple element of fruit length. We are going to assign to this the number of characters in a. So, what's happening is that a right now is apple, so nchar of a should give us five. And, that will be inserted into the apple element of fruit length. Let's run this. Now, it didn't print out anything or return anything, because all it did was change a vector.
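The steps just described can be sketched as follows. The spoken name "fruit length" is written here as fruitLength, which is an assumption about how the variable is spelled on screen; the calls themselves (rep, length, names, nchar) are the ones named in the narration.

```r
fruit <- c("apple", "banana", "pomegranate")

# rep() repeats a value; here NA is repeated once per element of fruit.
fruitLength <- rep(NA, length(fruit))

# Name the elements after the fruits so they can be looked up by name.
names(fruitLength) <- fruit

# a takes on "apple", then "banana", then "pomegranate".
for (a in fruit) {
    # Store the character count in the element named after the current fruit.
    fruitLength[a] <- nchar(a)
}
```

Running the loop prints nothing, because its only effect is to modify fruitLength in place.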
That change was sort of a side effect. So, let's print out fruit length, and we see that apple got five, banana got six, pomegranate got eleven. It did exactly what we wanted it to do. Now, a little interesting side effect of this: if we type in a, a sticks around. It's not local to the loop. A holds the last value that was successfully iterated, which in this case should be pomegranate.

Now, while loops can be handy, let's try to do this the R way. We will just say fruit length two gets nchar of fruit. And, when we print this out, we see five, six, and eleven. Sure, it doesn't have the names in there, but that's easy to fix. Names of fruit length two gets fruit. Print it out now, and we see it looks the same as fruit length. And, we can confirm that with the identical function, which checks whether two objects are exactly the same. And, there you have it, they're exactly the same.

So, yes, this was a trivial example of a for loop that could easily have been done with a built-in vectorized function, but the idea is the same: how to iterate over a vector and apply some sort of operation to every element of it. Or, it could be a list, or it could be a data frame. The important thing to remember is that a for loop iterates over a given set of indices or values.
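A minimal sketch of the comparison above, repeating the loop so the block runs on its own. The names fruitLength and fruitLength2 are assumed spellings of the spoken "fruit length" and "fruit length two". The final index-based loop, with its name fruitLength3, is not in the video; it is added only to illustrate iterating over indices rather than values.

```r
fruit <- c("apple", "banana", "pomegranate")

# Loop version, as described earlier in the transcript.
fruitLength <- rep(NA, length(fruit))
names(fruitLength) <- fruit
for (a in fruit) {
    fruitLength[a] <- nchar(a)
}

# The loop variable is not local to the loop; it keeps its last value.
print(a)

# Vectorized version: nchar() works on the whole vector at once.
fruitLength2 <- nchar(fruit)
names(fruitLength2) <- fruit

# identical() tests whether two objects are exactly equal.
print(identical(fruitLength, fruitLength2))

# Hypothetical variation: iterate over indices instead of values.
fruitLength3 <- integer(length(fruit))
for (i in seq_along(fruit)) {
    fruitLength3[i] <- nchar(fruit[i])
}
```

The vectorized version is two lines and needs no preallocated vector, which is why the narration calls it the R way.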