-
Book Overview & Buying
-
Table Of Contents
-
Feedback & Rating

Microsoft Power BI Cookbook
By :

Every time you click on a button to connect to any of Power BI Desktop's supported data sources or apply any transformation to a data source object, such as changing a column's data type, one or multiple M expressions are created reflecting your choices. These M expressions are automatically written to dedicated M documents and, if saved, are stored within the Power BI Desktop file as Queries. M is a functional programming language like F#, and it is important that Power BI developers become familiar with analyzing, understanding, and later, writing and enhancing the M code that supports their queries.
To prepare for this recipe, we will first build a query through the user interface that connects to the AdventureWorksDW2019 SQL Server database, retrieves the DimGeography table, and then filters this table to a single country, such as the United States:
Figure 2.1: SQL Server Get Data dialog
Uncheck the (Select All) option to deselect all values and then check the box next to a country, such as the United States, before clicking the OK button.
Figure 2.2: Filtering for United States only in the Query Editor
Note that the button for the EnglishCountryRegionName column changes to display a funnel icon. Also notice that, in the Query Settings pane on the right side of the window, a new option under APPLIED STEPS has appeared called Filtered Rows.
Figure 2.3: The Query Settings pane in the Query Editor
There are two methods for viewing and analyzing the M functions comprising a query; they are as follows:
The formula bar exposes the M function for the current step only. This formula bar appears just above the column headers for the preview data in the central part of the window. If you do not see this formula bar, click the View tab and check the box next to Formula Bar in the Layout section of the ribbon. All such areas of interest are boxed in red in Figure 2.4.
Figure 2.4: The Power Query Editor formula bar
When the Source step is selected under APPLIED STEPS in the Query Settings pane, as seen in Figure 2.3, we see the connection information specified on the initial dialog after selecting Get Data and then SQL Server. The M function being used is Sql.Database
. This function is accepting two parameters: the server name, localhost\MSSQLSERVERDEV
, and the database name, AdventureWorksDW2019
. Clicking on other steps under APPLIED STEPS exposes the formulas for those steps, which are technically individual M expressions.
The formula bar is useful to quickly understand the M code behind a particular query step. However, it is more convenient and often essential to view and edit all the expressions in a centralized window. This is the purpose of the Advanced Editor. To launch the Advanced Editor, follow these steps:
Figure 2.5: Advanced Editor on the Home tab of the Query Editor
Figure 2.6: The Advanced Editor view of the DimGeography query
As shown in Figure 2.6, using the Advanced Editor will mean that all of the Power Query code that comprises the query can be viewed in one place.
The majority of queries created for Power BI follow the let...in
structure, as per this recipe. Within the let
block, there are multiple steps with dependencies among those steps. For example, the second step, dbo_DimGeography, references the previous step, Source. Individual expressions are separated by commas, and the expression referred to following the in
keyword is the expression returned by the query. The individual step expressions are technically known as "variables".
Variable names in M expressions cannot have spaces without being preceded by a hash sign and enclosed in double quotes. When the Query Editor graphical interface is used to create M queries, this syntax is applied automatically, along with a name describing the M transformation applied. This behavior can be seen in the Filtered Rows step in Figure 2.6. Applying short, descriptive variable names (with no spaces) improves the readability of M queries.
Note the three lines below the let
statement. These three lines correspond to the three APPLIED STEPS in our query: Source, Navigation, and Filtered Rows. The query returns the information from the last step of the query, Filtered Rows. As more steps are applied, these steps will be inserted above the in
statement and the line below this will change to reference the final step in the query.
M is a case-sensitive language. This includes referencing variables in M expressions (RenameColumns
versus Renamecolumns
) as well as the values in M queries. For example, the values "Apple" and "apple" are considered unique values in an M query.
It is recommended to use the Power Query Editor user interface when getting started with a new query and when learning the M language. After several steps have been applied, use Advanced Editor to review and optionally enhance or customize the M query. As a rich, functional programming language, there are many M functions and optional parameters not exposed via the Power Query Editor's graphical user interface. Going beyond the limits of the Power Query Editor enables more robust data retrieval, integration, and data mashup processes.
The M engine also has powerful "lazy evaluation" logic for ignoring any redundant or unnecessary variables, as well as short-circuiting evaluation (computation) once a result is determinate, such as when one side (operand) of an OR
logical operator is computed as True
. Lazy evaluation allows the M query engine to reduce the required resources for a given query by ignoring any unnecessary or redundant steps (variables). The order of evaluation of the expressions is determined at runtime—it doesn't have to be sequential from top to bottom.
In the following example, presented in Figure 2.7, a step for retrieving Canada
was added and the "Filtered Rows"
step for filtering the results for the United States
was ignored. Since the CanadaOnly
variable satisfies the overall let
expression of the query, only the Canada
query is issued to the server as if the "Filtered Rows"
step were commented out or omitted.
Figure 2.7: Revised query that ignores the "Filtered Rows" step to evaluate Canada only
As a review of the concepts covered thus far and for future reference, Table 2.1 presents a glossary of the main concepts of the M language utilized in this book.
Concept |
Definition |
Expression |
Formulas evaluated to yield a single value. Expressions can reference other values, such as functions, and may include operators. |
Value |
The result of evaluating an expression. Values can be categorized into types which are either primitive, such as text ("abc"), or structured kinds, such as tables and lists. |
Function |
A value that produces a new one based on the mapping of input values to the parameters of the function. Functions can be invoked by passing parameter values. |
Type |
A value that classifies other values. The structure and behavior of values are restricted based on the classification of their type, such as Record, List, or Table. |
|
An expression that allows a set of unique expressions to be assigned names (variables) and evaluated (if necessary) when evaluating the expression following the in expression in a let...in construct. |
Variable |
A unique, named expression within an environment to be conditionally evaluated. Variables are represented as Applied Steps in the Query Editor. |
Environment |
A set of variables to be evaluated. The global environment containing the M library is exposed to root expressions. |
Evaluation |
The computation of expressions. Lazy evaluation is applied to expressions defined within let expressions; evaluation occurs only if needed. |
Operators |
A set of symbols used in expressions to define the computation. The evaluation of operators depends on the values to be operated on. |
Table 2.1: M Language elements
M queries are not intended as a substitute for the data loading and transformation workloads typically handled by enterprise data integration and orchestration tools such as Azure Data Factory (ADF), Azure Databricks, or SQL Server Integration Services (SSIS). However, just as BI professionals carefully review the logic and test the performance of SQL stored procedures and ETL packages supporting their various cubes and reporting environments, they should also review the M queries created to support Power BI models and reports. When developing retrieval processes for Power BI models, consider these common ETL questions:
One of the top performance and scalability features of M's query engine is called Query Folding. If possible, the M queries developed in Power BI Desktop are converted ("folded") into SQL statements and passed to source systems for processing.
If we use the original version of the query from this recipe, as shown in Figure 2.6, we can see Query Folding in action. The query from this recipe was folded into the following SQL statement and sent to the server for processing, as opposed to the M query engine performing the processing locally. To see how this works, perform the following:
Figure 2.8: View Native Query in Query Settings
Figure 2.9: The SQL statement generated from the DimGeography M query
Finding and revising queries that are not being folded to source systems is a top technique for enhancing large Power BI datasets. See the Pushing Query Processing Back to Source Systems recipe of Chapter 11, Enhancing and Optimizing Existing Power BI Solutions, for an example of this process.
The M query engine also supports partial query folding. A query can be "partially folded", in which a SQL statement is created resolving only part of an overall query. The results of this SQL statement would be returned to Power BI Desktop (or the on-premises data gateway) and the remaining logic would be computed using M's in-memory engine with local resources. M queries can be designed to maximize the use of the source system resources, by using standard expressions supported by query folding early in the query process. Minimizing the use of local or on-premises data gateway resources is a top consideration for improving query performance.
There are limits, however, to query folding. For example, no folding takes place once a native SQL query has been passed to the source system, such as when passing a SQL query directly through the Get Data dialog using the Advanced options. Figure 2.10 displays a query specified in the Get Data dialog, which is included in the Source step.
Figure 2.10: Providing a user-defined native SQL query
Any transformations applied after this native query will use local system resources. Therefore, the general implication for query development with native or user-defined SQL queries is that if they are used, try to include all required transformations (that is, joins and derived columns), or use them to utilize an important feature of the source database that is not being utilized by the folded query, such as an index.
Some other things to keep in mind regarding Query Folding are the following:
Table.Buffer
function can be used to avoid query folding. The table output of this function is loaded into local memory, and transformations against it will remain local.Change the font size
Change margin width
Change background colour