In this recipe, we shall demonstrate the power and versatility of ADF by performing a common task: importing data from several files (blobs) from a storage container into tables in Azure SQL Database. We shall create a pipeline, define datasets, and use a Copy activity to tie all the pieces together and transfer the data. We shall also see how easy it is to back up data with a quick modification to the pipeline.
In this recipe, we shall be using most of the services that were mentioned in the Technical requirements section of this chapter. Make sure that you have access to Azure SQL Database (with the AzureSQLDatabase instance we created) and the Azure storage account with the necessary .csv files already uploaded.
First, open your Azure Data Factory instance in the Azure portal and go to the Author and Monitor interface. Here, we shall define the datasets for the input files and the database tables, along with the linked services (for Azure Blob Storage and Azure SQL Database):
1. Start by creating a linked service to connect to the .adforchestrationstorage storage account:
Figure 2.2: The New linked service blade
Select Azure Blob Storage as the data store and give the new linked service a name (we named ours OrchestrationAzureBlobStorage1).
Configure the new linked service as shown in the following screenshot:
Figure 2.3: Connection configurations for Azure Blob Storage
Select the appropriate subscription and enter the name of your storage account (where you store the .csv files).
NOTE
In this recipe, we are using Account Key authentication to access our storage account, primarily for the sake of simplicity. However, in your work environment, it is recommended to authenticate using Managed Identity, taking advantage of the Azure Active Directory service. This is more secure and allows you to avoid using credentials in your code. You can review the references for more information about using Managed Identity with Azure Data Factory in the See also section of this recipe.
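Behind the authoring UI, each linked service is saved as a JSON definition. A minimal sketch of what this linked service might look like with Account Key authentication is shown below; the connection string is illustrative, and you would swap in your own account name and key (or a managed identity configuration):
    {
        "name": "OrchestrationAzureBlobStorage1",
        "properties": {
            "type": "AzureBlobStorage",
            "typeProperties": {
                "connectionString": "DefaultEndpointsProtocol=https;AccountName=adforchestrationstorage;AccountKey=<your-account-key>"
            }
        }
    }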
2. In the same way, create a linked service for Azure SQL Database:
Figure 2.4: Connection properties for Azure SQL Database
When choosing the data store, type Azure SQL into the search field to find it easily. Then, select your database (AzureSQLDatabase) from the dropdown in the Database Name section.
Now, we shall create two datasets, one for each linked service.
3. Create a dataset for the files in blob storage:
Figure 2.5: Create a new dataset
Name the new dataset CsvData and select OrchestrationAzureBlobStorage in the Linked Service dropdown:
Figure 2.6: Dataset properties
4. In a similar way, create a second dataset, which will represent the tables in Azure SQL Database. Select AzureSQLDatabase as the linked service and name the dataset AzureSQLTables.
5. Next, parameterize the AzureSQLTables dataset: in the Parameters tab, add a new parameter named tableName:
Figure 2.7: Parameterizing the dataset
Then, go to the Connection tab and enter dbo as the schema and @dataset().tableName in the table text field, as shown in the following screenshot:
Figure 2.8: Specifying a value for the dataset parameter
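If you open the dataset's code view, the parameterized definition should resemble the following sketch (property names may vary slightly with the dataset type and your Data Factory version):
    {
        "name": "AzureSQLTables",
        "properties": {
            "linkedServiceName": {
                "referenceName": "AzureSQLDatabase",
                "type": "LinkedServiceReference"
            },
            "parameters": {
                "tableName": { "type": "String" }
            },
            "type": "AzureSqlTable",
            "typeProperties": {
                "schema": "dbo",
                "table": {
                    "value": "@dataset().tableName",
                    "type": "Expression"
                }
            }
        }
    }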
6. In the same way, parameterize the CsvData dataset: in the Parameters tab, add a parameter named filename. Then, in the Connection tab, place your cursor in the file name textbox, click on the Add dynamic content link that appears underneath, and select the filename parameter. This will generate the correct code to refer to the dataset's filename parameter in the dynamic content text box:
Figure 2.9: Dynamic content interface
Click on the Finish button to finalize your choice.
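The resulting CsvData dataset definition should look roughly like the sketch below. Here, the DelimitedText type and the data container name are assumptions for illustration; use whatever format and container your .csv files are actually stored in:
    {
        "name": "CsvData",
        "properties": {
            "linkedServiceName": {
                "referenceName": "OrchestrationAzureBlobStorage",
                "type": "LinkedServiceReference"
            },
            "parameters": {
                "filename": { "type": "String" }
            },
            "type": "DelimitedText",
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": "data",
                    "fileName": {
                        "value": "@dataset().filename",
                        "type": "Expression"
                    }
                },
                "columnDelimiter": ",",
                "firstRowAsHeader": true
            }
        }
    }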
7. Verify that you can see both datasets on the Datasets tab:
Figure 2.10: Datasets resource in the Author tab of Data Factory
8. In the Author tab, create a new pipeline. Change its name to pl_orchestration_recipe_1, and add a Copy activity to the pipeline canvas:
Figure 2.11: Pipeline canvas with a Copy activity
In the Source tab of the Copy activity, select the CsvData dataset and specify countries.csv in the filename textbox. In the Sink tab, select the AzureSQLTables dataset and specify Country in the tableName text field.
NOTE
You will learn more about using the debug capabilities of Azure Data Factory in Chapter 9, Managing Deployment Processes with Azure DevOps. In this recipe, we introduce you to the Output pane, which will help you understand the design and function of this pipeline.
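Before running the pipeline, it can be instructive to open its code view. Assembled from the configuration above, the JSON should look roughly like the following sketch (the activity name here is only a placeholder, and the source and sink types depend on your dataset formats):
    {
        "name": "pl_orchestration_recipe_1",
        "properties": {
            "activities": [
                {
                    "name": "Copy From Blob To SQL",
                    "type": "Copy",
                    "inputs": [
                        {
                            "referenceName": "CsvData",
                            "type": "DatasetReference",
                            "parameters": { "filename": "countries.csv" }
                        }
                    ],
                    "outputs": [
                        {
                            "referenceName": "AzureSQLTables",
                            "type": "DatasetReference",
                            "parameters": { "tableName": "Country" }
                        }
                    ],
                    "typeProperties": {
                        "source": { "type": "DelimitedTextSource" },
                        "sink": { "type": "AzureSqlSink" }
                    }
                }
            ]
        }
    }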
9. Run your pipeline in Debug mode and watch the activity's progress in the Output pane at the bottom of the canvas:
Figure 2.12: Debug output
10. After your pipeline has run, you should see that the dbo.Country table in your Azure SQL database has been populated with the countries data:
Figure 2.13: Contents of the Country table in Azure SQL Database
We have copied the contents of the countries.csv file into the database. In the next steps, we shall demonstrate how parameterizing the datasets gives us the flexibility to define which file we want to copy and which SQL table we want as the destination without redesigning the pipeline.
11. In the Copy activity, specify airlines.csv for the filename in the Source tab and Airline for the table name in the Sink tab. Run your pipeline again (in Debug mode), and you should see that the second table is populated with the data – using the same pipeline!
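In the pipeline's JSON, the only things that change between the two runs are the parameter values passed to the dataset references, along the lines of this fragment of the pipeline definition sketched earlier:
    "inputs": [
        {
            "referenceName": "CsvData",
            "type": "DatasetReference",
            "parameters": { "filename": "airlines.csv" }
        }
    ],
    "outputs": [
        {
            "referenceName": "AzureSQLTables",
            "type": "DatasetReference",
            "parameters": { "tableName": "Airline" }
        }
    ]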
Now, suppose that we also want to keep backups of the imported data as .csv files. We can easily enhance the existing pipeline to accomplish this.
12. Add a second Copy activity to the pipeline, name it Backup Copy Activity, and configure it in the following way: in the Source tab, select the AzureSQLTables dataset (which uses AzureSQLDatabase for the linked service), and add Airline in the text box for the table name. In the Sink tab, select the CsvData dataset (which uses the blob storage linked service), and enter the following formula into the filename textbox: @concat('Airlines-', utcnow(), '.backup').
Figure 2.14: Adding backup functionality to the pipeline
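The expression is evaluated every time the activity runs, so each backup file gets a unique, timestamped name such as Airlines-2021-06-01T13:45:30.0000000Z.backup (the exact value depends on when the run happens). If you prefer a more compact timestamp, you could, for example, format it with formatDateTime:
    @concat('Airlines-', formatDateTime(utcnow(), 'yyyyMMddHHmmss'), '.backup')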
Let’s look at how this works!
In this recipe, we became familiar with all the major components of an Azure Data Factory pipeline: linked services, datasets, and activities. Every pipeline that you design will have those components.
In step 1 and step 2, we created the linked services to connect to Azure Blob Storage and Azure SQL Database. Then, in step 3 and step 4, we created datasets that connected to those linked services and referred to specific files or tables. We created parameters that represented the data we referred to in step 5 and step 6, and this allowed us to change which files we wanted to load into tables without creating additional pipelines. In the remaining steps, we worked with instances of the Copy activity, specifying the inputs and outputs (sources and sinks) for the data.
We used a built-in function for generating UTC timestamps in step 12. Data Factory provides many convenient built-in functions and expressions, as well as system variables, for your use. To see them, click on the Backup Copy Activity in your pipeline and go to the Source tab below it. Put your cursor inside the tableName text field.
You will see an Add dynamic content link appear underneath. Click on it, and you will see the Add dynamic content blade:
Figure 2.15: Data Factory functions and system variables
This blade lists many useful functions and system variables to explore. We will use some of them in later recipes.
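For instance, the following are all valid expressions that you could paste into a dynamic content field (a few illustrative examples; the full list is shown in the blade and in the expression language documentation):
    @pipeline().DataFactory
    @pipeline().RunId
    @concat('backup_', formatDateTime(utcnow(), 'yyyy-MM-dd'))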
Microsoft keeps extensive documentation on Data Factory. For a more detailed explanation of the concepts used in this recipe, refer to the following pages: