-
Book Overview & Buying
-
Table Of Contents
-
Feedback & Rating

The Tableau Workshop
By :

Quite often, the data that you're using will be stored as separate tables for efficiency purposes. There might be some fields that are common between tables and can be used to join the data sources together.
For example, suppose you, as a bank loan manager, would like to evaluate the best-suited customer profiles for granting a loan. Here, based on the customer-provided information, such as salary details and work experience, you would also need to access their financial history information, such as previous loans, outstanding loans, or any defaults. This kind of information can be fetched from their Experian score using the customer PAN as common information between the various data sources. This is how joins are commonly used in a lot of daily scenarios. You will learn about these joins and the different types in Tableau.
Tableau offers four types of joins, which are listed as follows:
Figure 2.9: Inner Join Between Tables A and B
Figure 2.10: Left Join Between Tables A and B
Figure 2.11: Right Join Between Tables A and B
Consider the following tables. Here, when you join table A and table B using an inner join, only the common values will be a part of the resulting table:
Figure 2.12: Full Outer Join Between Tables A and B
Figure 2.13: Union Between Tables A and B
You will learn more about these join types in detail in the following exercises.
As an analyst, you might come across scenarios in which you need to display the common records between two tables. This exercise aims to show how to join two different sheets into a single data source in Tableau.
You will join the Orders
table with the People
table using an inner join. By doing so, you will be able to identify the customer records present in the People
table along with the order information from the Orders
table, which will help you to understand customers' buying preferences.
Perform the following steps to complete this exercise:
Sample – Superstore
dataset into your Tableau instance as you did in Exercise 2.01.Orders
table first, followed by the People
table, from the Sheets
area to the Drag Sheets here
area. Alternatively, to add these sheets, you can double-click on them, and they will be added automatically to the canvas area. Tableau will auto-join the two tables using an inner join, as shown in the following figure:Figure 2.14: Data joining using an inner join
Join
symbol to open the Join
menu:Figure 2.15: Inner join properties
Note the various ways to join data. By default, Tableau performs an inner join on the common field names:
Figure 2.16: Various join options
Note
These instructions and images are based on Tableau version 2020.1. If you are using a later version of Tableau, such as 2021.4, this process may look quite different and even require an extra step. You can find additional guidance for this at the following URL: https://help.tableau.com/v2021.4/pro/desktop/en-gb/datasource_relationships_learnmorepage.htm
Orders
and People
tables, join on Customer Name
from Orders
and Person
from People
. First, de-select Region
, which is auto-selected by Tableau. To do this, click on Region
and select Customer Name
from the dropdown, as you can see in the following figure:Figure 2.17: Changing the join column
Figure 2.18: Final result of the inner join
People
table and select Person
as the joining column. Your joined columns should be as follows:Figure 2.19: Data preview of the Order and People tables
Now it's time to validate the results. This can be observed in the data grid screen in the bottom section.
You can see that you get only 58
rows in the joined dataset. Here, only the values from the Orders
table's Customer Name
column that match with values from the People
table's Person
column will be returned in the final dataset. Since the Person
table has only four values, only those values from the Customer Name
column that match these four are returned from the Orders
table.
In this exercise, you used inner join and analyzed the results returned by using this join type. Next, you will learn about the left join type.
In this exercise, you will join the Orders
table with the People
table in a left join. The objective of the left join is to verify how much customer information is present in the People
table. This will help identify and update the People
table so that you can expand the customer database, to drive better sales:
Figure 2.20: Join screen for the Orders and People tables
Orders
and People
tables to the canvas. Once done, you should see the join options, as follows:Left
: Figure 2.21: Selecting the Left join
People
table, Person
and Region
. Use the Sort
icon to sort the values, as highlighted in the following figure:Figure 2.22: Analyzing the left join results
Customer
names do not match any values in the Person
column.Figure 2.23: Nulls in the join result
You will observe that the rows where a match is not found are replaced by a null
value, which means the Person
table does not contain information for these customers. This means that you can add this customer information to the People
table to improve the data quality.
In this exercise, you learned how to perform a left join and how data is matched between the two tables. Next, you will learn about the right join type.
In this exercise, you will join the Orders
table with the People
table in a right join. Consider a scenario wherein the People
table consists of all the customers who have previously bought your company's products, and you want to fetch a complete list of the products a customer has bought, using information from the Orders
table. This will help you understand the buying habits of customers based on their past purchases.
The steps to complete this exercise are as follows:
Orders
and People
tables similar to how you did in the previous exercises so that you can see the following on your screen:
Figure 2.24: Join screen for the Orders and People tables
Right
join, as shown in the following figure:Figure 2.25: Selecting the Right join
Person
and Region
columns from the People
table. Use the Sort
icon to sort the values, as highlighted in the following figure:Figure 2.26: Analyzing the right join results
You will observe that the rows from the People
table contain information about customers with past orders. This can now help you to analyze what products a person tends to buy often, and accordingly, you can suggest similar products to them, for a better-targeted sales strategy.
In this exercise, you performed a right join on two tables and saw how to use the right join results to analyze data. Next, you will learn about a full outer join.
A full outer join would combine the results of both the joining tables into a single dataset. To do that in Tableau, you can use the join properties and change the join type to Full Outer
.
Figure 2.27: Selecting the Full Outer join
The next thing to cover is the union operation. In a union, the new table will be appended below the previous table in the final dataset. Usually, unions are used when you want to combine datasets with a common structure of columns. For example, order information for 2021 can be combined using a union with the order information for 2020 to get a unified dataset.
In the next exercise, you will learn how to implement a union in Tableau.
Consider a scenario related to a large retailer such as Walmart or Amazon, operating in multiple regions. In such a case, it makes more sense to store the data at the regional level so that it can contain products customized to that specific region. If you were to compare how the different regions perform among each other, you would need to combine these different data sources into one. This is where the concept of a union comes into play.
In this exercise, you will use the Orders
table, which is split by region. The files for different regions follow a similar column structure as the Orders
table but are segregated into different sheets based on their regions, as you can see from the following figure:
Figure 2.28: Input data for the Orders table preview stored as different tabs
You have the data for two regions: Central
and West
. You can implement a union to combine these two regions into a single dataset, as outlined in the following steps:
Union
Excel file using the Connect
option from the location where the files are saved, as done for the previous exercises. Once the file is imported, you should see the following screen:
Figure 2.29: Orders table for the Central and West regions
New Union
option to open the Union
popup, as shown in the following figure:Figure 2.30: New Union popup
Union
popup, as follows:Figure 2.31: Adding tables in a union
OK
to add the union to the data grid. You can now preview the data in the bottom section. Tableau will combine the data from both tables into a single data source.
Figure 2.32: Union data preview
Sheet
and Table Name
. Sheet
signifies which Excel file sheet this data belongs to and Table Name
refers to the table names in Tableau. This can be used to quickly identify which columns come from which sheets and tables.Figure 2.33: Table identification columns in the union result
In this exercise, you learned how to perform a union of multiple data sources.
In all the preceding exercises, you joined on only two data sources. It is possible to add more than two data sources. You will just need to specify in the join connection how the tables join to each other.
Figure 2.34: Joining with more than two tables
The preceding figure shows an example join on the Orders
table with the People
and Returns
tables. If there were a common field between the Returns
and People
tables, you could also join these two tables as per your requirements.
This completes the various ways you can join multiple tables in Tableau and concludes the discussion on the various ways to combine data from multiple sources together. The following sections will deal with preparing your data for your desired task.
Change the font size
Change margin width
Change background colour