-
Book Overview & Buying
-
Table Of Contents
-
Feedback & Rating

Microsoft Power BI Cookbook
By :

The full power of Power BI's querying capabilities is in the integration of distinct queries representing different data sources via its merge and append transformations. Retrieval processes that consolidate files from multiple network locations or integrate data from multiple data sources can be developed efficiently and securely. Additionally, the same join types and data transformation patterns SQL and ETL developers are familiar with can be achieved with the M language. This recipe provides examples of combining sources into a single query and leveraging the table join functions of M to support common transformation scenarios.
To follow along with this recipe, you can use the Merge Queries and Append Queries icons on the Home tab of the Power Query Editor to generate the join expressions used in this recipe. However, as joining queries is fundamental to the retrieval process, it is recommended to learn how to use the Table.Join
, Table.NestedJoin
, and Table.Combine
functions.
To get ready for this recipe, import the DimCustomer
and FactCallCenter
tables from the AdventureWorksDW2019 database by doing the following:
AdWorksDW
. This query should be similar to the following:
let
Source = Sql.Database("localhost\MSSQLSERVERDEV", "AdventureWorksDW2019")
in
Source
AdWorksDW
and choose Reference, select the DimEmployee
table in the data preview area, and rename this query DimEmployee
.Let
Source = AdWorksDW,
dbo_FactInternetSales = Source{[Schema="dbo",Item="FactInternetSales"]}[Data],
#"Filtered Rows" =
Table.SelectRows(
dbo_FactInternetSales, each
[OrderDate] >= #datetime(2011, 1, 1, 0, 0, 0) and
[OrderDate] <= #datetime(2011, 12, 31, 0, 0, 0))
in
#"Filtered Rows"
Table.SelectRows
expression to filter dates between January 1, 2012 and December 31, 2012.For additional details on performing these steps, see the Managing Queries and Data Sources recipe in this chapter.
The goal of this example is to produce an integrated table based on three "Sales" queries. While in this example the three queries come from the same data source, it is important to realize that the three queries could point to completely different data sources, such as three text files or even a text file, an Excel spreadsheet, and a SQL database. What is important is that the three queries have the same column names and number of columns. To implement this recipe, perform the following steps:
Table.Combine
(or Append Queries feature in the ribbon) function to return a single table based on the rows of the Sales2011, Sales2012, and Sales2013 queries.
Let
Source = Table.Combine( {Sales2011, Sales2012, Sales2013} )
in
Source
Table.AddColumn
function.
ManagerName =
Table.AddColumn(
dbo_DimEmployee,
"Manager Name", each [FirstName] & " " & [LastName]
)
Table
.SelectColumns
function.
SelectCols =
Table.SelectColumns(
ManagerName,
{ "EmployeeKey", "Manager Name", "Title" }
)
Managers
query and DimEmployee
query using the Table.NestedJoin
function or the Merge Queries feature in the ribbon.
Let
Source =
Table.NestedJoin(
DimEmployee, "ParentEmployeeKey", Managers, "EmployeeKey",
"ManagerColumn", JoinKind.LeftOuter
)
in
Source
Table
.ExpandTableColumn
function to add the Manager Name
and Manager Title
columns.
ManagerColumns =
Table.ExpandTableColumn(
Source, "ManagerColumn",
{ "Manager Name", "Title" },
{ "Manager Name", "Manager Title" }
)
EmployeeName =
Table.AddColumn(
ManagerColumns,
"Employee Name", each [FirstName] & " " & [LastName]
)
RenameTitle =
Table.RenameColumns(
EmployeeName,
{ "Title", "Employee Title" }
)
EmployeeKey
, Employee Name
, Employee Title
, Manager Name
, and Manager Title
columns.
SelectCols =
Table.SelectColumns(
RenameTitle,
{
"EmployeeKey", "Employee Name", "Employee Title",
"Manager Name", "Manager Title"
}
)
For the FactInternetSales2011to2013 query, only a single expression is required using the Table.Combine
function. No other expressions are necessary in this example given that the staging queries have identical column names and the same number of columns. The Table.Combine
function performs an append operation and does not remove duplicate rows similar to a SQL UNION statement.
Any columns which are unique to one of the input tables in a Table.Combine
function will be added to the result set with null values for the rows from the other tables. Depending on the scenario, the developer could apply the Table.Distinct
function to avoid any duplicate rows from reaching the data model.
The Employees query references the DimEmployees query as the left table in a Table.NestedJoin
function, and is joined to the Managers
query via a left outer join. The left join is required to retain all employee rows in this scenario, as the DimEmployees
table includes one employee that does not have a parent employee key, the Chief Executive Officer
.
The join is performed on the ParentEmployeeKey
column in the DimEmployees query to the EmployeeKey
column in the Managers query. After this step, all of the rows from the matching Managers table are stored within the row as a Table
object in the column ManagerColumn. When expanding the ManagerColumn column using the Table.ExpandTableColumn
function, the Manager Name column can retain the same name, but the Title column is renamed to Manager Title in order to avoid conflicting with the Title column in the DimEmployees query.
In implementing the table joins, you can choose to use the Table.Join
and Table.NestedJoin
functions. All six join types—inner, left outer, right outer, full outer, left anti, and right anti—are supported by both functions. The Table.NestedJoin
function enters the results of the join (the right or second table) into a new column of table values and will use local resources to execute the join operation, unless the Table.ExpandTableColumn
function is used to replace this new column with columns from the right table. A left outer join type is the default if the JoinKind
parameter is not specified. For performance reasons, Table.NestedJoin
should not be used without a Table.ExpandTableColumn
function removing the column of tables.
Conversely, the Table.Join
function automatically expands the left table with the columns from the right table input (a flat join) and defaults to an inner join if the JoinKind
parameter is not specified. The Table.Join
function gets folded to the source without any additional functions but requires that there are no matching column names between the joined tables for a JoinKind
other than inner join. For inner joins, the matching column names from both tables must be specified in the join key parameters. A Table.SelectColumns
function is required to exclude any columns from the right table added with the join.
Whether implemented via Table.NestedJoin
or Table.Join
, developers should look to use inner joins if the source tables have referential integrity, such as with foreign key constraints and whether this meets requirements. For joins against larger tables, developers should confirm that query folding is occurring and can evaluate the different query plans generated by alternative retrieval designs in terms of performance.
Note that the two rows for Rob Walters are due to a Slowly Changing Dimension (SCD) Type 2 process applied in the source database. For more information on SCDs, refer to this Wikipedia article: https://bit.ly/3yIQeI5.
Rather than creating separate lookup/join staging queries, it is possible to consolidate these expressions into a single let…in
M expression. For example, the following single query returns the exact same results as the Sales2011, Sales2012, Sales2013, and FactInternetSales2011to2013 queries:
let
Source = AdWorksDW,
Sales = Source{[Sche"a=""bo",It"m="FactInternetSa"es"]}[Data],
Sales2011Rows =
Table.SelectRows(
Sales, each
[OrderDate] >= #datetime(2011, 1, 1, 0, 0, 0) and
[OrderDate] <= #datetime(2011, 12, 31, 0, 0, 0)
),
Sales2012Rows =
Table.SelectRows(
Sales, each
[OrderDate] >= #datetime(2012, 1, 1, 0, 0, 0) and
[OrderDate] <= #datetime(2012, 12, 31, 0, 0, 0)
),
Sales2013Rows =
Table.SelectRows(
Sales, each
[OrderDate] >= #datetime(2013, 1, 1, 0, 0, 0) and
[OrderDate] <= #datetime(2013, 12, 31, 0, 0, 0)
),
Append = Table.Combine( {Sales2011Rows, Sales2012Rows, Sales2013Rows} )
in
Append
Inline query approaches are helpful in limiting the volume of queries, but you lose the management benefits provided by group folders and the Query Dependencies view. The graphical support makes it easier to explain and quickly troubleshoot a data retrieval process over a single but complex M expression. Staging queries are recommended for all but the most trivial projects and retrieval processes. Staging queries should generally never be loaded to the data model, as staging tables could both confuse the user and would require the data model to use additional resources to process and store the additional data.
Similarly, merge queries can also be combined into a single query. The following table breaks down the six different join types that can be specified in both the Table.NestedJoin
and Table.Join
functions. Both the Parameter and Parameter Value can be used, though the recipes in this book use Parameter as this makes the expressions easier to follow.
Join type |
Parameter |
Parameter value |
Inner |
|
0 |
Left Outer |
|
1 |
Right Outer |
|
2 |
Full Outer |
|
3 |
Left Anti |
|
4 |
Right Anti |
|
5 |
Table 2.3: Power Query (M) join types, parameters, and parameter values
One final note is that, for data source files with the same structure stored in the same network directory folder, Power BI offers the Combine Binaries transformation, which can be used with text, CSV, Excel, JSON, and other file formats. This feature can be used when creating a Folder query. The Combine Binaries feature automatically creates an example query and a function linked to this query, such that any required modification to the source files can be applied to all files, and the source location of the files can be easily revised.
Table.Combine
: http://bit.ly/3c6L2o0Table.NestedJoin
: http://bit.ly/30ZO3jZTable.Join
: http://bit.ly/3lxMRh7Table.ExpandTableColumn
: http://bit.ly/3eY6u0zTable.AddColumn
: http://bit.ly/3vGJZ6bTable.SelectColumns
: http://bit.ly/38Qk7LtTable.RenameColumns
: http://bit.ly/3rTVfd4Change the font size
Change margin width
Change background colour