2: Importing and transforming data using Power Query Editor
2.10 Append queries
2: Importing and transforming data using Power Query Editor
2.10 Append queries - Video Tutorials & Practice Problems
Video duration:
10m
Play a video:
Video transcript
<v ->In this sub lesson, we are gonna discuss</v> why we might want to append quarries together. We'll discuss common challenges in appending queries, and we'll demo how to append. Okay. So let's go through a generic example here first. So in this example here, we have two queries: query number 1, and query number 2. You'll notice that both queries have the exact same number of columns and they have the exact same names. And we're going to make an assumption here and assume that all the columns have the exact same data types. So column A's data type in query 1 matches the data type in query 2. So if this is the case and we run an append operation, we can simply bring query 2's rows into query 1 and create a brand new result set within query number 1. A more common way to do this is when you actually have differences in columns between the two queries. So in query number 1, notice we've got columns A, B, and C, and in query 2, we have columns A and B. But there's a column that exists in query 2 that is not in query number 1. So what happens in this situation? So we go ahead and run our append operation and power creator will do this for us, is it will create a brand new query with columns A B, C, and X. Column A will have the row values from both queries brought in. Column B will have the same. And C is where the first differences start to occur. So we'll see that the column C has the row values from the first query in the first three rows there. And the last four rows there are going to have blanks in them. So there'll be null values. And column X is going to be the same thing. The only difference is it has row values from query number two, but the rows that match up to query number one are going to show up as null values. So this is how power query is going to go ahead and bring our data sets together. Now as mentioned, this is a very common technique, and you're going to have to go ahead and do a little bit of cleanup work after the fact inside the power creator. And we're going to show you how to go ahead and do those things. And, in fact, what we're going to try to attempt to do is go through and show you how standardization of your names early on your process is going to enable you more towards the first example, where things just stack on top of each other neatly. So let's go ahead and get our Power BI desktop file open so we can go ahead and work our way through this sub lesson. Okay. So in the folder here that has all of our GitHub files, what I'm going to do is choose the one for sub lesson number 10 of lesson number 2, we're just going to double click on that file there, and let the Power BI desktop go ahead and open. Okay. So the Power BI desktop files open and what we want to do is, let's draw your attention to the tables on the right hand side of the screen in the fields pane here. So what I'm gonna highlight to you is that we've got United States sales right here. So this is the sales for United States, and we have a table that contains other country sales. Now from an end user's perspective, when they're going into the model to do reporting, this may be quite confusing. Ideally, what we'd like to do to make our end user self-service experience much more simplistic is to look to combine those two tables together into one. So that's where the append query technique is going to be helpful. So what we're going to do here is work our way through combining these two queries together here. So what I'm going to do here is I'm going to get myself into the power creator. Click on the home tab, transform data. And let's get ourselves into the power creator so we can go through the work of bringing those two queries together. Okay. So before we do that, let's recap the two queries that we brought in a little bit earlier. So let's go ahead first and take a look at the United States query. So we can see that we've got a column called product space key, order space date, sales space territory, with spaces in between each word, order quantity and our unit price. Now, if you recall from a previous sub lesson we purposely did not go in and change the data type of this unit price. We ideally want it to be a different data type. We ultimately want to change this to a fixed decimal, but we're going to leave it alone here right now and show you the value of actual standardization when it comes to doing things like appending. And constantly thinking about standardization, not only in data types, but in column names. Okay. So that's the United States query. Let's go down and focus in on the other query here. So we've got other country sales down below. Now, keep in mind here, we've got our product alternate key column here, and that actually has a different name than we would see in the United States sales. So if I just go back up to the United States queries here really quick, we can actually see that the column here is called product key and down here it's called product alternate key. So keep in mind when we're talking about doing the appends here, these columns aren't going to quickly and easily come together. We need to go through some standardization to actually make this work first. And the unit price column over here is already set to our fixed decimal price and the unit price over in the United States sales was left, left as text. Okay, so let's go ahead and actually go through this append process. So I'm going to click on the United States sales query here. Cause I want that to be my primary. I'm going to go to the home tab on the far right hand side of the screen, you're going to see a group of commands here called combined. What I'm going to do here is go to the append queries here in the dropdown box right next to it. I'm just going to click on it and I'm going to choose to append these queries as new. So what I'm gonna do here is create a brand new query by appending these two queries together. So we'll go append as new. And what's going to happen here is I'm going to get a dialog box that says, all right, well which queries do you actually want to append together here? So my first table is united states sales because that's the query I was highlighted on. The second table I want to bring in here is the one called other country sales. Now, technically these point, these are actually queries, even though the dialog box says first table. These are actually queries at this point. So now that I've got the two queries chosen I'm going to click okay. And we're going to see that a brand new query called Append1 is created for us. And one of the first things we want to do here's let's give this a logical name. So let's set calling it Append1, which is meaningless, let's go ahead and call this sales. Cause that's ultimately what this query is going to be used for is bringing all of our sales information together. Okay. So let's take a look and see what the result of that append were. We can see that we have the product key from one of the queries here. And I don't recall which one that's from. I believe that product key is from the United States query. So yes. So going back down to sales here, this one is from the United States sales, order date would've matched up between both files cause order space date was in both, sales territory region was the same, order quantity, and then unit price. Now the thing here about unit price is keep in mind that the data types were different between the two queries here. So what has happened here is that the any data type has been assigned to this unit price here because the power creator was not really sure what to do with this. So we're going to show you how to quickly and easily fix this, and don't fix it right here. So we're going to consolidate our cleansing steps a little bit further. And we now actually have this new column here called product alternate key which was coming from the other country sales. Once again, I'm not going to do the rename here. I'm going to go back into the original queries and do the data type change and the rename, and show how by going back into those original queries and coming back into the sales query here that we just created, everything will fix itself. Okay. So let's go back to United States sales. So what I wanted to do here is let's go to the unit price and now I'm going to go and make that data type change that we should have made earlier on. So I'm going to go ahead and go fixed decimal number, like so. It's asked me, do I want to replace current because that was the last step in our applied steps. I'm going to say replace current. Okay. So now we've got the proper data type on unit price there. So our query for United States is good. I'm going to go down to the other country sales query here and let's go ahead and standardize this name from product alternate key to product key. So it matches what was in the United States file. So I go ahead and enter now. I've done the cleaning that I really should have done a little bit earlier on when I was checking my names and data types. Now I've gone and done it. So if I now go to my sales query here which was the append, you will notice now that everything is lined up, the column that was called product alternate key has now lined up under product key, order date always lined up as did the region, as did the quantity. And now the data type for unit price has been determined cause they were both of the same data type, the fixed decimal type, in the original queries themselves. So once again, the power of good data types and coming up with standards early and often in our queries and power query is extremely beneficial. So let's just go ahead and recap what the slides again here, what we just did. Now, we've seen how this behaved in the power creator, let's just go through the slides one more time. Okay. So just as a recap here, right off the hop, we talked about appending queries and taking these two queries here and appending them together. And if the column names and data types matched up we're able to get a nice smooth and easy append happening like this and creating a brand new query. If we didn't go through the standardization process like we mentioned, or maybe sometimes it's just not possible because columns don't exist. We went ahead and did something like this. So we ended up with a situation where a brand new column was added onto this dataset. And each one of the rows that didn't have a match out of one of the other queries was left blank. Okay. So that brings us to the very end of the append queries section.