2.12 Merge queries - Video Tutorials & Practice Problems
Video duration:
7m
Play a video:
Video transcript
<v Instructor>Another technique, that is available</v> for combining datasets is the Merge Queries. Merging queries is used when you wanna bring columns of data from one dataset over into another. For example, what you see on the screen is two queries, query one and query two. What we would like to do is get the name column from query number two, and bring it into query number one. As you'll notice, query number one has a column called Product Number, but there is no product name. If we would like to bring those into the exact same dataset, we could use the merge queries functionality. If you were using native Excel functions, this is where you may look to use the VLOOKUP or XLOOKUP. Merging queries is a much more flexible way to actually bring these two datasets together because we have the ability to actually join on multiple columns. In this case, we'll only really need to join on one, which is the Product Number. So, if we actually merge these two datasets together on Product Number, we would have a final result look something like this. (keyboard clattering) In this case, we take the two queries, merge them together, and we would end up with the product name in this brand new query. So we would have our Order Date from query number one. Sales Territory from query number one. Product Number from query number one. Definitely Product Number can come from either one at query one or query two. So we'll only wanna choose one of them. In this example, we chose query one. We take the Unit Price from query one, the Order Quantity from query one, the color comes from query number two. And so does the name. So now we have this new query here, which is a combination of queries one and two, and there's two different ways to do this, which we'll show you here in a moment, to get this complete dataset. So let's move into an example of this using the data that we have. I will go ahead and open up the Power Query Editor. And what we want to do is we want to take the sales information, so let's take a look at the sales query. And we can see in here that there is a product number, but there is no product name or even product sub-categories or categories. So what we're going to look to do is bring the product sub-category and category from the product list. 'Cause we can see if we just scroll over here a little bit, that we have a, product sub-category and category over here as well as the product name. And we're gonna bring those three columns over into the sales query. So we're going to do that by clicking on the sales query and on the home tab. If we just go over a little bit to the right, we're gonna see a group share called combined. What we'll do next is click on merge queries, and we have two options. So we can say merge the queries. If we choose merge queries with the sales query higher like I have on the screen here, what we'll end up doing is take the columns from the second query which we'll choose here in a moment, and merge them into this query. I could choose merge queries as new, which could bring the sales and the product list together which is the one we'll merge in here. And it would create a brand new query with the columns, from those two columns or those two tables that we actually choose. So in this case, we're just gonna say merge queries. Given that we were highlighted on the sales table, it becomes our parent table and the table that I wanna merge into this, is the product list. So we'll choose product list. And now what we need to do is choose the column that we want to use to match between the two queries. So in this case with product number, in the sales table and product number over in our product list. So what we can see here is that we are performing a join as letting you know that the 46646 rows there in the sales table, all can find a match over in the product list table. So what I'll do here now is I'm gonna click okay. And now what ends up happening is I have an additional column that is sitting on the very back of the sales query, and it is in the form of a table. So what we're gonna do here is click these column expanders and open that up. And now what we're gonna do is we get the option to choose the columns we wanna bring from the product list into the sales table. So what I'm gonna do here is I'm just gonna, click select all to get rid of all the columns. I'm gonna uncheck this use original call name is prefix like so. And the three columns I actually want to bring into my sales query are the product name, sub-category name and category name, go ahead and click okay. And we'll see now that we have the product number and at the very end of the table we have the three columns, which we just brought from the product list table. So product name, sub-category name, and category name. So some additional things to know about the Merge Queries. If I just go back up to the step here where we did the merging and clicked the gear sign right here. And take a look at how we did this merge, it is important to do the columns that we're actually using to merge, actually have the same data type. So product number here and product number here. The column names do not matter, but the data type does. And also as mentioned, we can actually join on multiple columns. So if I wanted to, let's just hypothetically say that the Order Quantity was related to this finished good flags, what they're not. I could hit the control key, click on order quantity and then click on finished goods flags if these two are related. And what we're gonna notice here is we get some subscripts next to each one of these columns. So that's letting us know that the number one rate here next to product number, matches to this one here and the number two next to order quantity, matches to this one here. So we can start adding multiple columns to this particular relationship. I don't wanna do that because these columns are not technically, columns we would want to merge on. So I'm just gonna go ahead and turn those off and get the merge setback to product number across each one of the tables. Go ahead and cancel out of there again. And we can now see as mentioned, we have the exact result that we want. So I just wanted to go back and explain the ability to merge on multiple columns, and the fact of the data size have to match. And the column name does not matter when we were doing the merging. Although, it is a best practice to make sure that we do get the column names matching before we go through that particular exercise. Because that's just something as we mentioned back in some of the previous sub lessons, we just wanna standardize their names and get them set as early as possible because we will be constantly using those names throughout all of our work, through power query editor and into the actual source data itself. And that brings an end to this sub lesson.